PATENT 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Title: METHOD FOR THE DEVELOPMENT OF AN HIV VACCINE

Appl. No.: 10/667,534
Applicant: Adan Rios
Filed: 09/22/2003
Art Unit: 1648
Examiner: Parkin, Jeffrey S.
Docket No.: RIOS:004USC2
Customer No.: 32425
Confirmation No.: 9949



CERTIFICATE OF ELECTRONIC TRANSMISSION 
37 C.F.R. §1.8 



I hereby certify that this Appeal Brief is being electronically 
filed with the United States Patent and Trademark Office 
via EFS-Web on the date below. 

Travis M. Wohlers 



APPEAL BRIEF 



MAIL STOP APPEAL BRIEF - PATENTS 

Commissioner for Patents 

P. O. Box 1450 

Alexandria, VA 22313-1450 

Dear Sir: 

Appellant submits this Appeal Brief to the Board of Patent Appeals and Interferences in 
response to the Office Action dated June 15, 2007. Appellant filed a Notice of Appeal and 
Request for Pre-Appeal Brief Review on October 12, 2007. The Notice of Panel Decision from 
Pre-Appeal Brief Review was mailed November 23, 2007. Thus, the deadline for filing the 
Appeal Brief was December 23, 2007. A request for a one-month extension of time is 
included, which brings the deadline for filing the Appeal Brief to January 23, 2008. The fees for 
filing the Appeal Brief and for the one-month extension of time are included. Should any 
additional fees under 37 C.F.R. §§ 1.16 to 1.21 be required for any reason relating to the 
enclosed material, or should an overpayment be included herein, the Commissioner is authorized 
to deduct or credit said fees from or to Fulbright & Jaworski L.L.P. Account No.: 
50-1212/RIOS:004USC2. 

55190496.1 / 10311652 - 1 - 



TABLE OF CONTENTS 

I. REAL PARTY IN INTEREST 

II. RELATED APPEALS AND INTERFERENCES 

III. STATUS OF THE CLAIMS 

IV. STATUS OF AMENDMENTS 

V. SUMMARY OF CLAIMED SUBJECT MATTER 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

VII. ARGUMENT 

A. Claims 40-50 

B. Claims 43-47 

VIII. APPENDIX A - APPEAL CLAIMS 

IX. APPENDIX B - EVIDENCE APPENDIX 

X. APPENDIX C - RELATED PROCEEDINGS 



I. REAL PARTY IN INTEREST 

The real party in interest is the assignee, Photoimmune Biotechnology, Inc. 

II. RELATED APPEALS AND INTERFERENCES 

There are no related appeals or interferences. 

III. STATUS OF THE CLAIMS 

Claims 40-50 are currently pending and are rejected. Claims 1-39 have been canceled. 
The rejection of claims 40-50 is being appealed. 

IV. STATUS OF AMENDMENTS 

No amendments are pending. 

V. SUMMARY OF CLAIMED SUBJECT MATTER 1 

Independent claim 48 is directed to a method of eliciting an immune response 
(Specification, p. 13, ln. 4-6) comprising: obtaining a viral particle comprising a reverse 
transcriptase that has been inactivated by binding said reverse transcriptase with one or more 
azido-labeled compounds and then irradiating said reverse transcriptase (Specification, p. 11, ln. 
28-29; p. 12, ln. 8-9; p. 16, ln. 20-23); and administering the viral particle to a subject, wherein 
an immune response is elicited in the subject (Specification, p. 13, ln. 4-6). 

Dependent claim 43 is directed to the method of claim 48, wherein the viral particle is an 
HIV particle (Specification, p. 11, ln. 7-10). Claim 44 depends from claim 43 and specifies that 
the HIV particle is HIV-1 (Specification, p. 12, ln. 2). Claim 45 depends from claim 44 and 
specifies that the HIV-1 is Group M or Group O (Specification, p. 12, ln. 2-3). Claim 46 
depends from claim 45, and specifies that Group M is selected from the group consisting of clade 
A, clade B, clade C, clade D, clade E, clade F, clade G, clade H, and clade I (Specification, p. 12, 
ln. 3-5). Claim 47 depends from claim 45, and specifies that the Group M particles are clade B 
particles (Specification, p. 12, ln. 5-6). 

1 Parentheticals citing to support in the specification for the claim language are exemplary and not meant to indicate 
that the specific citations are the only support in the specification for the claim language. 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

Claims 40-50 are rejected under 35 U.S.C. § 112, first paragraph, as containing subject 
matter not reasonably described in the specification. 

VII. ARGUMENT 

A. Claims 40-50 

The only issue on appeal is whether current claims 40-50 are supported by adequate 
written description in the specification as required by 35 U.S.C. § 112, first paragraph. The 
examiner alleges that the specification does not provide adequate written description of the 
recitation "a viral particle comprising a reverse transcriptase" in claim 48. Rather, the examiner 
contends that the claims should be limited to an HIV particle comprising an HIV reverse 
transcriptase (RT). The examiner's position is based on a legally incorrect application of the 
written description requirement of 35 U.S.C. § 112, first paragraph. 

The present specification teaches that a reverse transcriptase may be inactivated by 
binding the reverse transcriptase with one or more azido-labeled compounds and then irradiating 
it (see, e.g., p. 12, ln. 8-9). While HIV is used to exemplify the teachings in the specification, the 
specification specifically states that "the methodology of the present invention is applicable to 
any retrovirus which may be associated with any animal or human disease as a method for 
development of effective immunogens and preventative vaccines. Thus, the present invention 
has a broader applicability than the exemplified HIV vaccine" (p. 16, ln. 20-23). The 
examiner is improperly attempting to limit Appellant's claims to a preferred embodiment when 
the specification's disclosure explicitly states a broader application of the disclosed methods. 

To satisfy the written description requirement, a patent specification must describe the 
claimed invention in sufficient detail that one skilled in the art can reasonably conclude that the 
inventors had possession of the claimed invention. Vas-Cath, Inc. v. Mahurkar, 935 F.2d 1555, 
1563 (Fed. Cir. 1991). In addition to the express statement in the specification that "the present 
invention has a broader applicability than the exemplified HIV vaccine" (p. 16, ln. 20-23), one 
skilled in the art would have reasonably concluded that the inventor had possession of the 
currently claimed method for at least the following reasons. 

As explained in the specification, HIV is a retrovirus and a unique aspect of retrovirus 
replication is the conversion of a single-stranded RNA from the virus genome into a 
double-stranded DNA molecule that must integrate into the genome of the host cell prior to the synthesis 
of viral proteins and nucleic acids (Specification, p. 3, ln. 4-12). Accordingly, all retroviruses 
possess a reverse transcriptase enzyme, which converts the RNA of their genetic material into 
DNA (Specification, p. 3, ln. 14-16). Furthermore, since all reverse transcriptases prime the 
synthesis of new DNA from tRNA, which is a molecule with abundant secondary structure 
strongly associated with the enzyme, it is generally accepted that the catalytic unit among reverse 
transcriptases is phylogenetically conserved (see, e.g., Flavell, Retroelements, reverse 
transcriptase and evolution, Comp. Biochem. Physiol., vol. 110B, No. 1, pp. 3-15, 1995 (Evidence 
Appendix - Exhibit 1); Boeke, The unusual phylogenetic distribution of retrotransposons: A 
hypothesis, Genome Res. 2003 13:1975-1983 (Evidence Appendix - Exhibit 2); Nakamura et al., 
Telomerase catalytic subunit homologs from fission yeast and human, Science vol. 277, August 
15, 1997 (Evidence Appendix - Exhibit 3); Springer et al., Phylogenetic relationships of reverse 
transcriptase and RNase H sequences and aspects of genome structure in the gypsy group of 
retrotransposons, Mol. Biol. Evol. 10(6):1370-1379, 1993 (Evidence Appendix - Exhibit 4); 
Lingner et al., Reverse transcriptase motifs in the catalytic subunit of telomerase, Science 
276:561 (1997) (Evidence Appendix - Exhibit 5); Valverde-Garduno et al., Functional analysis 
of HIV-1 reverse transcriptase motif C: site directed-mutagenesis and metal cation interaction, J. 
Mol. Evol. 1998 Jul; 47(1):73-80 (Evidence Appendix - Exhibit 6); Seifarth et al., Rapid 
identification of all known retroviral reverse transcriptase sequences with a novel versatile 
detection assay, AIDS Research and Human Retroviruses, vol. 16, Number 8, pp. 721-729, 2000 
(Evidence Appendix - Exhibit 7)). 

Since retroviruses cannot integrate into the genetic machinery of the host cell without 
reverse transcription, the inhibition of reverse transcriptase has as a universal consequence the 
inability of any retrovirus to integrate within the genetic machinery of a suitable host cell. Thus, 
regardless of the type of retrovirus, the inactivation of reverse transcriptase as described in the 
present specification would be understood by a person of ordinary skill in the art to be applicable 
to any retrovirus. The importance of RT to retroviruses in general is further evidenced by the 
number of known anti-retroviral compounds that interfere with RT activity (e.g., AZT, 
nevirapine, pyridinones, carboxanilides) (Specification, p. 3, ln. 23 to p. 4, ln. 8). 

As described in the present specification, a reverse transcriptase may be inactivated by 
binding the reverse transcriptase with one or more azido-labeled compounds and then irradiating 
it (see, e.g., p. 12, ln. 8-9). Numerous compounds that bind to reverse transcriptases were known 
in the art (see, e.g., Specification, p. 3, ln. 23, to p. 4, ln. 11). Furthermore, in view of the 
phylogenetic conservation among reverse transcriptases (see Evidence Appendix - Exhibits 1-7) 
and the fact that all reverse transcriptases prime the synthesis of new DNA from an RNA 
molecule with abundant secondary structure strongly associated with the enzyme, those of 
ordinary skill in the art would have known that a compound that binds to one RT would 
generally be expected to bind to other RTs as well. For example, many anti-retroviral 
compounds are nucleoside analogs, such as AZT, which inhibit the reverse transcriptase by 
competing with the naturally occurring deoxynucleotides needed to synthesize the viral DNA. 
All reverse transcriptases need deoxynucleotides to synthesize viral DNA; thus, a person of 
ordinary skill in the art would have understood that a nucleoside analog that competed with the 
naturally occurring deoxynucleotides to inhibit one RT would also be expected to bind to and 
inhibit other RTs, and that the inhibition of RT has the universal consequence of inhibiting the 
ability of retroviruses to integrate into the DNA of a host cell and replicate. 

As mentioned above, numerous compounds that bind reverse transcriptases were 
known in the art and are disclosed in the specification (Specification, p. 3, ln. 27, to p. 4, ln. 1; 
p. 12, ln. 14-18; p. 21, ln. 21-26). The specification teaches that these compounds may be 
converted to azido photoaffinity labels and utilized for the inactivation of reverse transcriptase 
(Specification, p. 21, ln. 21-28). Thus, it would have been understood by a person of ordinary 
skill in the art at the time of filing that the inactivation of reverse transcriptase as described in the 
present specification would be generally applicable to viral particles comprising a reverse 
transcriptase. 

In view of the above, the present specification describes the claimed invention in 
sufficient detail that one of ordinary skill in the art can reasonably conclude that Appellant had 
possession of the claimed invention at the time of filing. Appellant, therefore, requests that the 
Board overturn this rejection. 






B. Claims 43-47 

Appellant separately argues dependent claims 43-47. The examiner stated that claims 
directed towards "a method of eliciting an immune response by administering a human 
immunodeficiency virus (HIV) particle comprising an HIV reverse transcriptase (RT) that has 
been inactivated" are supported by the specification (Action, p. 5). Dependent claim 43 is 
directed to the method of claim 48, wherein the viral particle is an HIV particle. Claim 44 
depends from claim 43 and specifies that the HIV particle is HIV-1. Claim 45 depends from 
claim 44 and specifies that the HIV-1 is Group M or Group O. Claim 46 depends from claim 45, 
and specifies that Group M is selected from the group consisting of clade A, clade B, clade C, 
clade D, clade E, clade F, clade G, clade H, and clade I. Claim 47 depends from claim 45, and 
specifies that the Group M particles are clade B particles. 

In rejecting a claim under the written description requirement, the examiner has the initial 
burden of presenting evidence or reasons why a person skilled in the art would not recognize in 
an applicant's disclosure a description of the invention defined in the claims. Here, the examiner 
has not only failed to establish a lack of written description for claims 43-47, he has expressly 
stated that the claims are supported by adequate written description. Accordingly, the rejection 
of claims 43-47 should be overturned. 

(512) 536-5654 (telephone) 
(512) 536-3035 (facsimile) 

Date: January 23, 2008 



Travis M. Wohlers 
Reg. No. 57,423 

VIII. APPENDIX A - APPEAL CLAIMS 



40. The method of claim 48, wherein said azido-labeled compound is azido 
dipyridodiazepinona or N-[4-chloro-3-(3-methyl-2-butenyloxy)phenyl]-2-methyl-3-furanocarbothiamide. 

41. The method of claim 48, wherein said azido-labeled compound is 
N-[4-chloro-3-(3-methyl-2-butenyloxy)phenyl]-2-methyl-3-furanocarbothiamide. 

42. The method of claim 48, wherein the irradiation is with UV light. 

43. The method of claim 48, wherein the viral particle is an HIV particle. 

44. The method of claim 43, wherein said HIV particle is HIV-1. 

45. The method of claim 44, wherein said HIV-1 is Group M or Group O. 

46. The method of claim 45, wherein said Group M are selected from the group consisting of 
clade A, clade B, clade C, clade D, clade E, clade F, clade G, clade H, and clade I. 

47. The method of claim 45, wherein said Group M particles are clade B particles. 

48. A method of eliciting an immune response comprising: 

obtaining a viral particle comprising a reverse transcriptase that has been inactivated by 
binding said reverse transcriptase with one or more azido-labeled compounds and 
then irradiating said reverse transcriptase; and 

administering the viral particle to a subject, wherein an immune response is elicited in the 
subject. 

49. The method of claim 48, wherein the subject is human. 

50. The method of claim 49, further defined as a method of vaccination. 

INVITED REVIEW 

Retroelements, reverse transcriptase and evolution 

Andrew J. Flavell 

Department of Biochemistry, Medical Sciences Institute, The University, Dundee DD1 4HN, 
U.K. 

Retroelements are genetic elements that can exist as DNA or RNA or DNA/RNA duplexes. 
Although retroviruses are the best known retroelements, there are many other types, including 
close relatives of retroviruses like LTR retrotransposons, more distant relatives like non-LTR 
retrotransposons, caulimoviruses and hepadnaviruses and elements with virtually no 
similarity, like retrons. Virtually all retroelements are 'selfish DNAs' with no involvement 
with the normal development or maintenance of their host cells, the only known exception 
being telomeres/telomerases which maintain the ends of chromosomes. Virtually all 
retroelements use tRNA, or RNA with strong secondary structure, to initiate their reverse 
transcription. The coincidence between the use of tRNA, a molecule central to the conversion 
of RNA to protein, with reverse transcriptase, an enzyme which is crucial for the conversion 
of RNA to DNA is striking, because RNA probably preceded DNA and protein in evolution. 
It seems plausible that retroelements were present at the genesis of living systems. 

Key words: Retrovirus; Reverse transcriptase; Retrotransposon; Retron; Retroelement; 
Evolution; Telomerase; Caulimovirus; Hepadnavirus; Pararetrovirus. 

Comp. Biochem. Physiol. 110B, 3-15, 1995. 



Introduction 

This review is intended to give the reader an 
overview of the many apparently diverse 
manifestations of genetic elements contain- 
ing reverse transcriptase. What emerges 
from a closer look at these elements is the 
striking similarity between many of them, 
suggesting that many, and perhaps all these 
elements share a common evolutionary 
origin. 



Correspondence to: A. J. Flavell, Department of 
Biochemistry, Medical Science Institute, The University, 
Dundee DD1 4HN, U.K. 

Received 6 January 1994; accepted 10 June 1994. 



The Origin of Reverse Transcriptase 

RNA has an intrinsic catalytic ability to 
make and break its own phosphodiester 
backbone. We, therefore, believe that RNA 
was probably the first self-replicating entity 
and evolution first worked on it before 
DNA and protein were brought into the 
picture (Cech and Bass, 1986; Darnell and 
Doolittle, 1986). 

Although RNA was probably the first genetic material, it is poorly suited to that 
role because it is chemically labile. DNA is much more inert and better suited to 
carrying genetic information between generations. RNA can be converted to DNA by 
reverse transcriptase, an enzyme which is related in sequence to RNA replicases 
which exist in RNA viruses (Xiong and Eickbush, 1990). We believe that reverse 
transcriptase evolved from an RNA replicase and this event was central to the 
development of DNA as the main form of genetic material in living organisms. 

Table 1. Types of retroelement 
Telomeres/telomerase 
Group II introns 
Retrons 
Fungal mitochondrial plasmids 
Non-LTR retrotransposons 
LTR retrotransposons 
Retroviruses 
Caulimoviruses and hepadnaviruses 

Once reverse transcriptase had accom- 
plished this feat, it largely bowed out of the 
main story of evolution, leaving DNA as 
the genetic store and RNA as either the 
message for production of proteins (mes- 
senger RNA) or part of the machinery of 
RNA splicing, polyadenylation, etc, cata- 
lysed by ribonucleoprotein complexes, and 
translation, catalysed by transfer RNAs 
and ribosomal RNAs. But it did not disap- 
pear entirely and has been found still in a 
wide variety of guises. These 'retroelements' 
are listed in Table 1 and I will first review 
them in ascending order of sophistication, 
concentrating mainly on what we know of 
their replication cycles, before discussing 
the evolutionary implications of these data. 

Telomeres and Telomerases 

Telomeres are the extreme ends of linear 
chromosomes. Linear chromosomes are 
ubiquitous (as far as we know) in eukaryotic 
nuclear genomes and they face the same 
dilemma found by all linear DNAs, namely, 
how to resist agents which lead to shorten- 
ing, such as attack by DNA exonucleases or 
the inherent inability of DNA polymerase 
to copy its template to its extreme 5' end. 
The basic solution to the problem for the 
majority of eukaryotes is to locate a simple 
sequence at the telomere which counteracts 
the reduction in size by replicating extra 
copies of itself (Blackburn, 1992; Schippen, 
1993). The exact sequences of the repeats 
within telomeres are species-specific, but a 
typical example is that of Tetrahymena 
thermophila, in which the sequence 
GGGGTT is repeated many times. 



The template for these extra copies is an 
RNA molecule which is bound tightly to a 
specialised type of reverse transcriptase 
called telomerase. In this case, the template 
RNA carries a homologous sequence 
(5'AACCCCAA3') which is used to generate 
a new DNA strand (Fig. 1). 
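
As an illustrative aside (not part of the original review), the copying step described above can be sketched in a few lines of Python. The sketch assumes simple Watson-Crick pairing and uses the template sequence quoted in the text; the function name `reverse_transcribe` and the pairing table are hypothetical names chosen for this example, and the sketch ignores the translocation step and everything else about how the real enzyme works.

```python
# Minimal sketch, assuming plain Watson-Crick pairing between an RNA template
# and the DNA strand copied from it. Names here are illustrative only.

# DNA base that pairs with each RNA template base.
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def reverse_transcribe(template_rna_5to3: str) -> str:
    """Return the DNA strand (written 5'->3') complementary to an RNA template.

    The new strand is antiparallel to the template, so the template is read
    from its 3' end and each base is replaced by its pairing partner.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(template_rna_5to3))


# Template sequence as quoted in the text above.
template = "AACCCCAA"
new_dna = reverse_transcribe(template)
print(new_dna)
```

Copying the quoted template in this way yields a G- and T-rich DNA segment, which is the kind of simple repeat the text describes being added to the telomere.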

There is still very little known about the 
telomerase enzyme which carries out this 
function, because it is difficult to work with. 
An exciting possibility is that the RNA 
template itself is directly involved in the 
catalytic process, making it another RNA 
enzyme (ribozyme). However, this hypoth- 
esis remains untested and no ribozyme to 
date has been shown to catalyse reactions 
on DNA. Perhaps RNA never learned this 
feat and DNA has only ever been 'geneti- 
cally manipulated' by proteins. 

Group II Introns 

Two evolutionarily ancient types of intron 
survive in the organelles or plasmids of 
lower eukaryotes. These group I and II 
introns have the ability to splice themselves 
out of their precursor mRNAs, without the 
help of proteins (Cech and Bass, 1986). The 
two groups are classified by their character- 
istic sequence motifs which themselves 
define group-specific conserved secondary 
structures in the intron RNA. Several of 
these introns encode 'maturase' genes which 
aid the splicing process in the cell, although 
the polypeptides encoded by these genes are 
not essential for self-splicing in the test tube 
(Carignani et al., 1983). A variety of 
polypeptides are used; these are derivatives 
of enzymes concerned with RNA metabolism 
(Lambowitz and Perlman, 1990). 

Polypeptides resembling aminoacyl 
tRNA synthetases are common and reverse 
transcriptase-like proteins are also found. 
These reverse transcriptase-like proteins 
have not yet been shown to have enzymic 
activity but another strange property of 
group II introns suggests that this may be 
the case. Some group II introns occasion- 
ally transpose to new chromosomal lo- 
cations (Mueller et al, 1993; Sellem et al., 
1993). These new locations are sometimes 
non-homologous to pre-existing insertion 
sites for the introns, suggesting that this is 
true transposition and not a phenomenon 
related to homologous recombination. 
While other models are possible, it is feasible that the transposition occurs in the 
following way. First, an RNA copy of the intron becomes inserted into another RNA 
by a reversal of the normal splicing mechanism (such reverse-splicing has already 
been demonstrated in group II introns; Augustin et al., 1990). Then the novel 
intron-containing RNA is reverse transcribed into DNA which then either recombines 
into the chromosomal DNA directly or by gene conversion. 

[Fig. 1 (figure not reproduced). Panel labels: Elongation; Translocation of template RNA. DNA is shown in black.] 

Fig. 2. msDNA. Base pairs involved in secondary structure are shown. DNA is shown in black 
and RNA in red. 

Fig. 3. Biosynthesis of msDNA from its retron template. DNA is shown in black and RNA 
in red. Initiation of reverse transcription is from the 2' OH group of a G base (indicated by *). 

Retrons 

The bizarre RNA-DNA chimaera shown in Fig. 2 is found in many bacteria (Inouye 
and Inouye, 1993). This small extrachromosomal molecule, called msDNA, is 
synthesized in bacterial strains containing a genetic element called a retron (Fig. 3). A 
retron minimally consists of the DNA template for msDNA synthesis, plus a reverse 
transcriptase gene. A single promoter transcribes the entire genetic element, 
producing a long transcript from which reverse transcriptase is synthesized. The 
transcript can adopt a secondary structure, because it contains several regions of 
self-complementarity. This folded RNA then acts as a template for reverse transcription 
by the retron-encoded enzyme. This synthesis seems to be primed from a 2' OH 
group (starred in Fig. 3), unlike all other known reverse transcriptases, RNA 
polymerases and DNA polymerases, which prime from 3' OH groups. The reverse 
transcription becomes stalled at one of the RNA hairpin loops. Degradation of 
most of the RNA part of the resulting heteroduplex leads to the mature msDNA 
molecule. 

[Fig. 5. Gene structural features of retrotransposons and retroviruses. See text for functions 
of genes. Panels: Non-LTR retrotransposon; Ty1-copia group retrotransposon; Retrovirus and 
gypsy group retrotransposon.] 

What is the point of this strange phenom- 
enon? Only a proportion of the members of 
a retron-containing bacterial species actu- 
ally contain retrons. We are therefore confi- 
dent that this genetic element is a kind of 
'selfish DNA' (Doolittle and Sapienza, 
1980; Orgel and Crick, 1980; Sapienza and 
Doolittle, 1981) which is non-essential for 
the host and whose prime function is its 
own propagation. It seems unlikely that 
msDNA is an intermediate in an extrachro- 
mosomal replication cycle of retrons, be- 
cause large parts of the complete retron are 
missing from it (including the reverse tran- 
scriptase gene) but it may be the abortive 
descendant of such an intermediate. We 
shall see below that most retroelements use 
reverse transcription to replicate themselves 
via extrachromosomal intermediates. 



Fungal Mitochondrial Plasmids 

Certain fungi (notably some Neurospora 
species) sometimes contain small circular 
double stranded plasmids in their mito- 
chondria. The best characterized of these 
(the Mauriceville and Varkud plasmids) 
have been shown to contain reverse tran- 
scriptase genes and to be replicated by a 
transcription-reverse transcription cycle 
(Fig. 4). Transcription of the plasmid yields 
an RNA of exactly the same size as the 
plasmid, with an intriguing secondary 
structure at the 3' end which is reminiscent 
of tRNAs and the 3' ends of RNA viruses 
of plants and bacteria; indeed, several bases 
conserved among tRNAs (shown by small 
black circles in the figure) are found in the 
plasmid RNA (Fig. 4a). In vitro studies 
using the purified mitochondrial enzyme 
have shown that reverse transcription can be 
initiated in three different ways (Wang and 
Lambowitz, 1993). The first involves 
elongation from the 3' end of the RNA, 
perhaps via the secondary structure shown 
in Fig. 4b. The second way is by elongation 
of a short, non-specific DNA oligonucle- 
otide which is bound to the reverse tran- 
scriptase (Fig. 4c). This time DNA synthesis 
begins at the penultimate base of the RNA. 
The final way (Fig. 4d) is identical to the 
second, except that no exogenous DNA 
primer is required, just a G base. dGMP, 
dGDP and dGTP all function in this in vitro 
reaction. It is unclear which, if any, of these 
methods of reverse transcription initiation 
predominates in vivo. At present, we do not 
know how the full DNA-RNA hybrid, 
which is synthesized as the first step in the 
reverse transcription process, is eventually 
converted into the circular double stranded 
plasmid. 

Non LTR Retrotransposons 

Retrotransposons are transposable gen- 
etic elements which use reverse transcrip- 
tion during their movement to new 
chromosomal locations. Non-LTR retro- 
transposons are the simplest type of retro- 
transposon. They are found in the genomes 
of the majority of eukaryotes (including 
mammals, where they are called LINE, L1 
or Kpn elements; Hutchinson et al., 1989; 
Martin, 1991). They contain two discernible 
genes, one of which is a reverse transcrip- 
tase/RNAse H gene (rt RNAseH; Fig. 5). A 
single transcript, initiating at the exact 5' 
boundary of the retroelement, encodes all 
the genetic information of the element 
(Mizrokhi et al., 1988). The details of the 
transposition cycle are still unclear but a 
particle, comprised of proteins encoded by 
the other gene in the element (marked by ? 
in the figure) and containing the retrotrans- 
poson's RNA and reverse transcriptase, has 
been identified and is perhaps involved in 
the reverse transcription and integration 
process, just as the analogous protein- 
aceous particles are implicated in the trans- 
position cycles of LTR retrotransposons 
and retroviruses (see below; Deragon et al, 
1990; Martin and Branciforte, 1993). The 
exact reverse transcription mechanism for 
non-LTR retrotransposons is still unclear, 
but it is believed that the reverse transcript 
becomes inserted at random nicks in the 
chromosome (Finnegan, 1989). 

LTR Retrotransposons and Retroviruses 

These two types of retroelement are so 
similar that I will consider them together. 
The gene structures of retrotransposons 
and the DNA forms of retroviruses (called 
proviruses) are shown in Fig. 5. There are 
two main groups of LTR retrotransposons, 
the Ty1-copia group and gypsy group, 
named after prototype elements in yeast 



and Drosophila, respectively (Bingham and 
Zachar, 1989). The former is structurally 
the more simple group. Both groups are 
found in fungi, plants and insects. There 
has been some debate about the existence of 
retrotransposons in vertebrates. Early 
claims were probably no more than defec- 
tive retroviruses (Hodgson et al., 1990) but 
recent comprehensive PCR-based searches 
have identified Ty1-copia group retrotransposons 
in fish, amphibia and reptiles 
(though not mammals and birds as yet; 
Flavell and Smith, 1992; Flavell et al., 
1994). 

LTR retrotransposons are known to use 
intracellular ribonucleoprotein particles as 
intermediates in their transposition cycles 
(Shiba and Saigo, 1983; Garfinkel et al., 
1985). The protein structural components 
of the particles are encoded by the gag 
genes (see Fig. 5). The particles also contain 
virtually full length transcripts of the trans- 
posons and several enzymes, all of which 
are retrotransposon-encoded. A protease 
(encoded by the pr gene) is involved in 
cleaving the precursor polyprotein into the 
mature proteins. A reverse transcriptase first 
copies the transposon RNA to form an 
RNA-DNA duplex. For all LTR retrotransposons 
and retroviruses, initiation of 
the reverse transcription is from a 3' end of 
tRNA physically bound to the template and 
the enzyme (Bingham and Zachar, 1989; 
Varmus and Brown, 1989). The next step is 
the degradation of the RNA in the duplex 
by ribonuclease H to enable synthesis of the 
second DNA strand by the reverse tran- 
scriptase. Finally, an integrase (encoded by 
the int gene) catalyses the insertion of the 
double-stranded DNA copy into the 
chromosome. 

Retroviruses use just the same genes to achieve the same result as LTR 
retrotransposons (Varmus and Brown, 1989). The only significant functional difference 
between the two is the retrovirus's proven ability to leave the cell as a virus particle. 
Entry of virus particles into a new cell requires an envelope glycoprotein, encoded 
by the env gene (Fig. 5), which is embedded in the plasma membrane envelope acquired 
by the virus when it buds out through the cell membrane. In fact, gypsy group 
retrotransposons are believed by some to actually be retroviruses. They possess an extra 
gene of unknown function in the same location as env and there is some evidence 
to suggest that the gypsy transposon of Drosophila can form an infectious particle 
(Kim et al., 1994). 

Hepadnaviruses and Caulimoviruses 
(Pararetroviruses) 

Two other virus groups use reverse transcription 
in their life cycles, though neither 
normally integrates its DNA into the 
host chromosomes. Hepadnaviruses (hepatitis 
B-like viruses) infect vertebrates. They 
contain a small circular DNA molecule 
which is partially single-stranded (Fig. 6; 
Tiollais et al, 1981). This DNA encodes 
genes specifying a reverse transcriptase, 
capsid structural components and viral sur- 
face proteins. Upon infection, the virion 
DNA is filled in by the encapsidated reverse 
transcriptase to form a closed circular 
double stranded DNA. Transcription of 
this template yields an RNA which is trans- 
lated into the virus-encoded proteins (Sum- 
mers and Mason, 1982). Initiation of 
reverse transcription is primed by a protein 
bound to the virion RNA, unlike all other 
retroelements which use RNA primers 
(Gerlich and Robinson, 1980). Reverse 
transcription commences in the particle but 
remains incomplete, forming the partially 
single-stranded virion DNA. 

Caulimoviruses are plant viruses which 
share with hepadnaviruses the property of 
encapsidating an incompletely reverse 
transcribed DNA (Bonneville et al., 1988). 
In this case, the capsid nucleic acid is largely 
double-stranded with a few nicks (Fig. 6). 
The basic steps of transcription from an 
extrachromosomal closed circular DNA 
template into RNA which is translated into 
reverse transcriptase and virus particle 
components are shared with hepadnaviruses. 
Priming of caulimovirus reverse transcrip- 
tion uses a tRNA, just as LTR retrotrans- 
posons and retroviruses do. 

The Evolution of Retroelements 

From the above survey, it is evident that 
there is a wide variety of genetic elements 
which use reverse transcriptase for their 
propagation in a broad spectrum of living 
organisms from bacteria to man. I said at 
the outset that we believe the process of 
reverse transcription to be evolutionarily 
ancient. Can we assemble all the known 
manifestations of reverse transcription and 
the genetic elements involved with this pro- 
cess into an evolutionary tree? In some 
cases, this is quite easy (Fig. 7). Retro- 
viruses and LTR retrotransposons are obvi- 
ously related and the more complex gene 
structure of the former, plus phylogenetic 
comparisons of the DNA sequences of these 
elements (Temin, 1980; Xiong and Eick- 
bush, 1990) suggest strongly that LTR 
retrotransposons were the ancestors of 
retroviruses (Fig. 7). Judged by sequence 
homology and structural similarity, the 
most likely immediate progenitor of retro- 
viruses was a gypsy group LTR retrotrans- 
poson. Ty1-copia group retrotransposons, 
with their simpler gene structure, probably 
arose before the gypsy group, though 
whether they were the direct ancestor is 
unclear. 

The hepadnaviruses and caulimoviruses 
are more difficult to fit into this picture. 
Phylogenetic analysis suggests that 
caulimoviruses evolved from gypsy group 
LTR retrotransposons (Doolittle et al., 
1989; Xiong and Eickbush, 1990), but hep- 
adnaviruses are highly diverged from both 
groups. Temin has proposed that both 
viruses evolved from retroviruses by loss of 
the ability of the extrachromosomal DNA 
to integrate into the host chromosome 
(Temin, 1989). Xiong and Eickbush suggest 
that hepadnaviruses arose from a recombi- 
nation event between a pre-existing RNA 
virus and a primitive retrotransposon while 
caulimoviruses derived in a similar manner 
from gypsy group retrotransposons. Both 
models are plausible but the latter is more 
likely, at least in the case of hepadnaviruses, 
which have a priming mechanism for the 
initiation of replication which differs from 
that of retroviruses and resembles that of 
some RNA viruses, such as poliovirus. 

What about the more primitive retroele- 
ments? Non-LTR retrotransposons may be 
the ancestors of LTR retrotransposons, be- 
cause of their simpler construction, less 
sophisticated transposition mechanism and 
ubiquity in the eukaryotes (Doolittle et al., 
1989; Xiong and Eickbush, 1990). The evol- 
utionary status of the other elements 
mentioned here is still very unclear 
(Doolittle et al., 1989; Xiong and Eickbush, 
1990). Sequence comparisons between the 
reverse transcriptases argue that retrons 
and the fungal mitochondrial plasmids are 
grouped together, with group II introns 
being the nearest relatives to these se- 
quences and non-LTR retrotransposons the 
next nearest. Additionally, some non-LTR 
retrotransposons transpose to the telomeric 
regions of Drosophila chromosomes 
(Biessman et al., 1992; Levis et al., 1994) 
suggesting an evolutionary link with 
telomerases. All this implies that the best 
candidate for the progenitor of all these 
elements belongs to an ad hoc group 
containing non-LTR retrotransposons, 
retrons, telomerases, fungal mitochondrial 
plasmids and group II introns, though it is 
impossible to say which, if any, came first. 

Two properties unite virtually all 
retroelements, suggesting that they are fun- 
damental to these elements and were pre- 
sent at their genesis. Firstly, virtually all 
reverse transcriptases prime the synthesis of 
new DNA from an RNA molecule with 
abundant secondary structure which is 
strongly associated with the enzyme. In the 
large majority of cases this is a tRNA. 
tRNAs themselves play a central role in 
living systems as the key component in the 
conversion of RNA to protein. RNA was 
probably the original genetic material and 
transfer RNAs and reverse transcriptase are 
the fundamental components of the ma- 
chinery needed to synthesize DNA and 
protein, respectively, from RNA. The close 
association between the two in most 
retroelements to this day is striking and 
seems to this author a potent argument for 
this model of early evolution. 

The second property uniting retroele- 
ments is their lack of any obvious advan- 
tage to their cellular hosts. With the single 
exception of telomerases, all reverse tran- 
scriptases are involved in the propagation 
of genetic elements which are not involved 
in the day-to-day functioning of cells. In 
fact, some retroelements are dangerous 
parasites (retroviruses, caulimoviruses and 
hepadnaviruses). Thus, even though their 
origin probably lies at the dawn of cellular 
life, these elements have remained aloof 
from the business of enabling a cell to 
survive and replicate in an environment. 

The Unusual Phylogenetic Distribution of 
Retrotransposons: A Hypothesis 

Jef D. Boeke 

Retrotransposons have proliferated extensively in eukaryotic lineages; the genomes of many animals and plants 
comprise 50% or more retrotransposon sequences by weight. There are several persuasive arguments that the 
enzymatic lynchpin of retrotransposon replication, reverse transcriptase (RT), is an ancient enzyme. Moreover, the 
direct progenitors of retrotransposons are thought to be mobile self-splicing introns that actively propagate 
themselves via reverse transcription, the group II introns, also known as retrointrons. Retrointrons are represented in 
modern genomes in very modest numbers, and thus far, only in certain eubacterial and organellar genomes. 
Archaeal genomes are nearly devoid of RT in any form. In this study, I propose a model to explain this unusual 
distribution, and rationalize it with the proposed ancient origin of the RT gene. A cap and tail hypothesis is 
proposed. By this hypothesis, the specialized terminal structures of eukaryotic mRNA provide the ideal molecular 
environment for the lengthening, evolution, and subsequent massive expansion of highly mobile retrotransposons, 
leading directly to the retrotransposon-cluttered structure that typifies modern metazoan genomes and the eventual 
emergence of retroviruses. 



The Ancient Origin of Reverse Transcriptase 

There are two arguments for an ancient origin of RT. The first is 
theoretical and is based on the now widely accepted proposal 
that an RNA world preceded the form of biology with which we 
are familiar, the DNA world. Darnell first articulated that RT must 
have been present during the time of the transition between 
these two worlds, and therefore, must be considered ancient 
(Darnell and Doolittle 1986; Fig. 1). The second argument is 
based on the fact that RT genes are very broadly distributed 
among the branches of the tree of life, and have largely (but not 
entirely) descended vertically from an ancestral RT 
gene (Doolittle et al. 1989; Xiong and Eickbush 1990; Eickbush 
1994; Malik et al. 1999). Furthermore, the RT gene has seemingly 
reinvented itself in multiple and diverse forms (Boeke and Stoye 
1997). In addition to the familiar retroviruses, there are pararet- 
roviruses, which package DNA, but replicate by reverse transcrip- 
tion, two major classes of retrotransposons (described in the fol- 
lowing section), as well as a more bizarre group of elements 
found in bacteria and organellar genomes, and hence, referred to 
as the prokaryotic group. The discovery of RTs in bacteria, first in 
the form of msDNA (short for multicopy single-strand DNA) or 
retron elements (Yamanaka et al. 2002) and later in the form of 
retrointrons (Belfort et al. 2002), provided dramatic evidence in 
favor of an ancient origin for RT. 

The highly diverse tree of retroelements can be rooted in the 
prokaryotic group of elements (Eickbush 1994). The prokaryotic 
group includes three types, that is, retrons, retroplasmids, and 
retrointrons. Retrons are RT genes that produce an unusual 
branched structure called msDNA made by reverse transcription 
of a precursor RNA primed from an internal guanosine residue — 
unlike other retroelements, they have no known function or abil- 
ity to mobilize autonomously (Yamanaka et al. 2002). Thus far, 
they have been found only in a very limited subset of bacteria. 
Retroplasmids are known only from the mitochondria of certain 
fungi, replicate by reverse transcription, and exist in both circular 
and linear (hairpin) forms (Kuiper and Lambowitz 1988; Walther 
and Kennell 1999). The retrointrons, or group II introns, mobi- 
lize or retrohome to empty target sites (unspliced versions of 
their host genes) via a very unconventional mechanism. The ex- 
cised intron lariats insert into double-stranded target DNA (cop- 
ies of the DNA containing the flanking exons but lacking the 
intron) by reversal of the normal splicing reaction, probably 
aided by the maturase activity of the RT proteins encoded by 
these elements. They are then converted into DNA by use of a 
target-primed reverse transcription (TPRT) mechanism similar to 
that used by non-LTR retrotransposons (Zimmerly et al. 1995a,b; 
Yang et al. 1996). Priming is facilitated by the action of a small 
endonuclease domain of the RT that cleaves the intact strand of 
the double-stranded target DNA. 

Several independent arguments strongly suggest that the 
prokaryotic group of RT sequences is ancestral to the RT se- 
quences of retrotransposons and retroviruses. Counterarguments 
to each of these proposals exist, but as a group, these proposals 
are compelling. (1) It is a simple evolutionary paradigm that 
things evolve progressively from a simple state to an ever more 
complex one. Retrons, retroplasmids, and retrointrons all encode 
a single RT protein, often with only that enzymatic activity, 
whereas retrotransposons and retroviruses always encode mul- 
tiple enzyme activities and usually encode multiple separate pro- 
teins. These additional activities, which include proteases, zinc 
finger domains, at least three distinct types of endonucleases, 
and integrase, appear to have been recruited from eukaryotic 
host genomes at multiple times in evolution, probably using the 
same types of mechanisms used by retroviruses when they pick 
up cellular oncogenes (Telesnitsky and Goff 1997). A widely ac- 
cepted extension of this simple argument is that the retroviruses 
and pararetroviruses evolved from LTR retrotransposons by ac- 
quiring new proteins conferring the ability to efficiently leave 
and re-enter host cells, also known as horizontal transfer or lat- 
eral transfer (Doolittle et al. 1989). (2) The RT of one member of 
the prokaryotic group has the ability to perform primer- 
independent synthesis, similar to RNA polymerase, the presumed 
ancestor of RT (Wang and Lambowitz 1993). (3) The RT se- 
quences of the prokaryotic group are the most similar to the 
sequences of the presumed ancestral outgroup of sequences, the 
RNA-directed RNA polymerases (RdRPs). Non-LTR retrotrans- 
posons, LTR retrotransposons, and retroviral RTs are progres- 
sively less closely related to RdRP sequences (Eickbush 1994). (4) 
The sequence of telomerase, a specialized RT considered by many 
to represent an ancient eukaryotic enzyme, clusters with prokary- 
otic and non-LTR retrotransposon RT sequences (Eickbush 1997; 
Nakamura and Cech 1998). 

Two Types of Retrotransposons That Mobilize 
by Distinct Mechanisms 

The retrotransposons can be divided into two major groups, the 
non-LTR and the LTR retrotransposons. The mechanisms of these 
two types of retroelements are summarized briefly here and in 
Figure 2. In addition, two smaller retrotransposon families, the 
DIRS1 (Goodwin and Poulter 2001) and BEL (Malik et al. 2000) 
groups, appear to be distinct, but are much less widely distrib- 
uted, and thus, will not be discussed further here. 

Of the two major retrotransposon classes, the non-LTR ret- 
rotransposons are less well understood mechanistically, but nev- 
ertheless, a good outline of the process exists (Kazazian and 
Moran 1998). The element mRNAs are translated in the cyto- 
plasm, producing one or two proteins. One of these is a polypro- 
tein with at least two critical activities, an endonuclease, and an 
RT. Most of the non-LTR elements also encode an RNA chaper- 
one, whose role remains unclear. However, the endonuclease/RT 
protein is thought to bind the element RNA to form an RNP 
complex, which then enters the nucleus. This complex then ac- 
quires a host DNA target, in which a nick is made by the endo- 
nuclease. In a remarkable target-primed reverse transcription 
(TPRT) process, the 3' OH of the cleaved target DNA primes re- 
verse transcription of the element RNA, at or near the 3' poly(A) 
end. The mechanism of the cutting of the second strand and 
second-strand synthesis is less well understood, but may well be 
symmetrical with the first, involving a second round of TPRT 
with the newly made DNA strand serving as template. 

LTR retrotransposons move via a mechanism quite similar 
to that used by retroviruses. Generally, two primary protein prod- 
ucts are made, corresponding to retroviral Gag (coat proteins) 
and the readthrough product Gag-Pol (RT and other enzymes). 
The Gag proteins together with two RNA molecules are as- 
sembled into a virus-like particle (VLP). This encapsidation may 
serve to further protect the element's genomic RNA molecules 
from degradation. Reverse transcription occurs in the VLP, and is 
primed by a cellular tRNA (Chapman et al. 1992) or retrotrans- 
poson RNA fragment (Levin 1995). The initial product of the RT 
reaction, minus strand strong-stop DNA, is transferred to the 3' 
end of the RNA in a critical step that leads to subsequent comple- 
tion of the minus strand DNA synthesis. If even relatively small 
amounts of RNA were lost exonucleolytically from the 5' or 3' 
end during this process, retrotransposition would fail. Several 
additional steps similar to those used by retroviruses, including a 
second priming event and strand transfer, lead to the final prod- 
uct of reverse transcription, a double-stranded DNA (Boeke and 
Stoye 1997; Telesnitsky and Goff 1997). RNA integrity is impor- 
tant for this process, which can take several hours to complete; 
however, a recombination-like template switching process can 
bypass damage to the element's RNA. The resulting DNA, to- 
gether with the integrase protein (processed previously by an 
element-encoded protease from the RT precursor protein Gag-Pol) 
is transported to the nucleus, where it inserts via a transesterifi- 
cation reaction very similar to that used by DNA transposons 
(Mizuuchi and Baker 2002). 

Figure 2 The lifecycles of non-LTR (left) and LTR retrotransposons are outlined. Wavy lines are RNA molecules; thin 
black lines are cDNA strands. 

1976 Genome Research 
www.genome.org 

Downloaded from www.genome.org on February 16, 2007 
Retrotransposons and RNA Metabolism 

Modern-Day Distribution of RT Genes 
Faced with the assumption that RT is an ancient enzyme, it be- 
comes difficult to explain the modern-day distribution of RT 
genes in the three kingdoms of life, Eubacteria, Archaea, and 
Eukarya. The majority (67%) of sequenced eubacterial species 
lack a detectable RT gene in their genome (Fig. 3). Those 
species of eubacteria that do contain RT genes mostly con- 
tain only one or two. The great majority of Archaea lack 
recognizable RTs altogether; the only exception to this trend, 
Methanosarcina, has a very large genome thought to have been 
formed by the incorporation of a large segment of a eubacterial 
genome as a late lateral transfer event in its evolution (Deppen- 
meier et al. 2002). This species contains a set of retrointrons 
similar to those found in eubacteria (Dai and Zimmerly 2003). In 
contrast, RT genes are found in virtually all eukaryotic genomes, 
and are generally found in 20 to >500,000 copies per genome. 
Even when adjusted for genome size, eukaryotes contain signifi- 
cantly more RT genes. Virtually all of these are non-LTR and/or 
LTR retrotransposons. In a recent grand synthesis, Bushman po- 
etically described eukaryotic genomes as "genes floating on a sea 
of retrotransposons" (Bushman 2002), although an astute re- 
viewer of this work points out that genes do not float, or else 
gene order colinearity would not be observed in genomic com- 
parisons. Some well-known extreme examples of this include the 
human genome, with ~1,000,000 non-LTR retrotransposons, SINEs, 
and endogenous retroviruses (Smit 1996) and the maize genome, 
estimated to contain ~200,000 copies of intact retrotransposons 
(SanMiguel et al. 1996; J. Bennetzen, pers. comm.). It is the abun- 
dance of retroelements that largely explains the C-value paradox 
in most metazoans. What led to such an abundance of RT genes? 

It could be argued that the observed discrepancy is a simple 
consequence of genome streamlining in bacterial genomes. Al- 
though there is no doubt that streamlining is a major evolution- 
ary force in both eubacteria and Archaea, one can consider as a 
control for the above conclusion the distribution of DNA trans- 
posons among the three kingdoms. DNA transposons are found 
in almost all eubacterial and archaeal genomes and typically are 
found in between 10 and 100 copies. They are also found in eukary- 
otes, but have a somewhat spottier distribution there, being quite 
well represented in certain groups (Drosophila, Caenorhabditis, 
Brassica), but notably absent from others (Saccharomyces, Schizo- 
saccharomyces). 

The dramatic discrepancy in retroelement distribution be- 
tween prokaryotes and eukaryotes strongly suggested to me that 
there was some special feature(s) of being eukaryotic that repre- 
sented a permissive state for RT and allowed the evolution and 
proliferation of retrotransposons. 

The Evolution of Eukaryotes and Their Retroelements 
The release of numerous eubacterial, Archaeal, and eukaryotic 
genome sequences has provided extensive fodder for models of 
how eukaryotes evolved. It is clear that we eukaryotes contain a 
mixture of genes descended from Archaeal and eubacterial an- 
cestor cells (Woese et al. 1990; Margulis 1996). The precise se- 
quence of events in the evolution of eukaryotes has been debated 
hotly, but a consensus is developing about the major events that 
must have occurred. This consensus view will be recounted here 
briefly. 

Figure 3 Bacterial genomes contain very few RT genes. A total of 96 completely sequenced 
bacterial genomes were searched by BLASTp on the comprehensive microbial resource at www. 
tigr.org. The two queries used were the LtrA RT from a Lactobacillus lactis group II intron (Q57005) 
and a retron RT from Escherichia coli (P23070). The number of BLAST hits with an E value <0.001 
was tabulated for both queries, and the higher number was taken as the measure of RT gene 
number (visual inspection showed that this modestly inflated the number of RT genes as some of 
the low-scoring hits were false positives). 
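
The tabulation described in the Figure 3 legend (count the BLAST hits with an E value below 0.001 for each of two queries, then take the higher count as the RT gene number for that genome) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary layout, and the E values below are assumptions made up for the example, not data or code from the study.

```python
# Illustrative sketch of the per-genome tabulation described in the Figure 3 legend.
# All names and E values here are hypothetical; none come from the original analysis.

E_VALUE_CUTOFF = 0.001  # threshold stated in the legend


def count_hits(e_values, cutoff=E_VALUE_CUTOFF):
    """Count BLAST hits whose E value is below the cutoff."""
    return sum(1 for e in e_values if e < cutoff)


def rt_gene_estimate(hits_by_query):
    """Take the higher hit count across the queries as the RT gene number."""
    return max((count_hits(e_values) for e_values in hits_by_query.values()), default=0)


def fraction_without_rt(genomes):
    """Fraction of genomes whose estimated RT gene number is zero."""
    estimates = [rt_gene_estimate(queries) for queries in genomes.values()]
    return sum(1 for n in estimates if n == 0) / len(estimates)


# Made-up E values for three hypothetical genomes and two hypothetical queries.
example_genomes = {
    "genome_A": {"intron_RT_query": [1e-40, 2e-5, 0.5], "retron_RT_query": [3e-10]},
    "genome_B": {"intron_RT_query": [0.2], "retron_RT_query": []},
    "genome_C": {"intron_RT_query": [], "retron_RT_query": [1e-4, 0.01]},
}

if __name__ == "__main__":
    for name, queries in example_genomes.items():
        print(name, rt_gene_estimate(queries))
    print("fraction lacking RT:", fraction_without_rt(example_genomes))
```

Taking the maximum rather than the sum avoids counting the same gene twice when both queries hit it, at the cost noted in the legend that low-scoring false positives can modestly inflate the count.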

Archaea and eubacteria were two ancient lineages of cells 
that had evolved distinct mechanisms of transcription and DNA 
replication, among other things, but otherwise shared the fun- 
damental properties of being unicellular heterotrophs. Symbiosis 
of eubacterial cells (the progenitor of the mitochondrion) and 
Archaeal cells ultimately led to a proto-eukaryote containing a 
eubacterial endosymbiont. This may have begun as a casual or 
accidental symbiosis, but at some point, provided some impor- 
tant selective advantage. Several other events followed, probably 
involving an additional cycle(s) of acquiring additional genomes 
via consumption (Taylor 1974), as well as the acquisition of a 
number of other distinctive eukaryotic features, which will be 
considered separately in the next section. These events gave rise 
to a primitive eukaryote with the recognizable nuclear genome 
and mitochondrial genome, each in a membrane-bounded com- 
partment. Acquisition of an additional photosynthetic bacterium 
by consumption led to a plant lineage, but for simplicity, this will 
not be considered further here. Because the modern eubacteria 
contain RT genes and the Archaea largely lack them, I will make 
the fairly arbitrary assumption that the same was true at the 
dawn of eukaryotes. The eubacterially derived endosymbiont(s) 
slowly transferred its genes to the nucleus of the primitive eu- 
karyotic cell, becoming ever more dependent on its host. Re- 
markably, this process of gene transfer from mitochondria to 
nucleus is functional in modern-day yeast cells, in which the 
transfer of mitochondrial gene segments to the nucleus can be 
observed experimentally (Thorsness et al. 2002). RT genes 
present as retrointrons would be transferred 
readily to the nucleus through this passive and stochastic pro- 
cess. Movement of retrointrons via homing to near-cognate sites 
might well have led to a proliferation of introns and the evolu- 
tion of the splicing apparatus as an intron-removal mechanism. 
The stage was set for the evolution of retrotransposons. What 
specific features of eukaryotic cells made this possible? 



The Nuclear Membrane 

The existence of a nuclear membrane 
would appear to be an impediment and not 
a help to the evolution of retrotransposi- 
tion. The translation process occurs outside 
of the nucleus, whereas transposition hap- 
pens inside, and therefore, retrotrans- 
posons have evolved transport mechanisms 
to overcome these barriers. Therefore, the 
existence of the nuclear membrane is in- 
hibitory to successful retrotransposition. 

Linear Chromosomes 

The transition to linear chromosomes from 
the presumed ancestral circular state may 
well have provided an early opportunity for 
an RT gene to make itself indispensable to 
its host by acquiring the ability to lengthen 
telomeres, leading to the enzyme telomer- 
ase, still the major mechanism for telomere 
formation in modern eukaryotes (Naka- 
mura and Cech 1998), providing an elegant 
solution to the end-replication problem 
posed by the termini of linear DNA mol- 
ecules. However, this would provide a 
niche for but a single copy of RT. Also, a compelling case can 
be made that telomerase was a relatively late acquisition, by 
devolution of a retrotransposon (Pardue et al. 
1997), as telomeres could, in principle, solve the end-replication 
problem via the formation of T-loops (Griffith et al. 1999). Thus, 
linear chromosomes per se do not provide a compelling oppor- 
tunity for the evolution of retrotransposons. 

Introns 

As argued above, RT may have played an important role in the 
widespread accumulation of introns in primitive cells, although 
the timing of this event has also been the subject of much debate 
(Gilbert and Glynias 1993; Logsdon Jr. and Palmer 1994; Stoltz- 
fus 1994; Logsdon Jr. 1998; Simpson et al. 2002). However, the 
simple existence of these introns did not confer any special se- 
lective advantage on RT genes. Rather, the proliferation of in- 
trons may well be a consequence of a permissive RNA environ- 
ment that allowed them to mobilize more readily in the genome. 

Sex and Diploidy 

Donal Hickey (Hickey 1993) and Tim Bestor (Bestor 1999) have 
provided eloquent arguments that the evolution of sex and dip- 
loidy provides an opportunity for mobile elements to invade host 
species and march inexorably to fixation in the host genome, 
providing they do not decrease the fitness of their host by >50%. 
However, this argument applies to both retrotransposons and 
DNA transposons, and thus, is insufficient to explain the selec- 
tive amplification of retrotransposons in eukaryotes. 

RNA Processing Machinery 

The physical separation of the processes of transcription and 
translation and changes in gene organization (perhaps the con- 
sequence of the nascent eukaryal nuclear genome being bom- 
barded with fragments of its endosymbiont guest DNA), and 
other factors, led to important changes in the way RNA was me- 
tabolized in eukaryotic cells. The major changes were the com- 
partmentalization of single-coding regions (by and large) into 
stereotypical mRNA structures punctuated by a 5' cap structure 
and 3' poly(A) tail (Fig. 4). Not only are the structures of these 
mRNAs prominent uniquely in eukaryotic cells, but they also 
coordinate to play a critical role in eukaryotic translational ini- 
tiation; the 5' cap and 3' poly(A) interact in the cytoplasm, ef- 
fectively circularizing the RNA. 

Figure 4 The cap and tail structure of eukaryotic mRNA. 

Other Factors 

There are likely to be many factors that control the proliferation 
of transposable elements of all types in eukaryotes. For example, 
organisms such as yeasts and Fugu, with smaller genomes, tend to 
have much higher recombination rates, and these organisms 
carry lighter transposon burdens. Ergo, it can be argued that such 
high recombination rates are inconsistent with explosive types of 
transposon amplification, such as has been seen in humans and 
maize. Very high transposon copy numbers could cause exten- 
sive secondary damage to highly recombination-proficient ge- 
nomes. Similarly, diverse mechanisms controlling the copy num- 
ber of certain transposons' activity, such as cosuppression of Ty1 
elements in yeast (Jiang 2002) and RNAi in many eukaryotes 
(Ketting et al. 1999), play important roles in controlling trans- 
poson copy numbers. Whereas such factors are undoubtedly of 
considerable importance in determining transposon copy num- 
bers in individual species, they do not help to explain the general 
trend observed that eukaryotes tend to have very high retrotrans- 
poson copy numbers relative to prokaryotes. 

The cap and tail hypothesis proposes that this unique ter- 
minal structure created three special molecular opportunities for 
the evolution of retrotransposons. First, these termini created a 
very stable long-lived genomic RNA freed from the necessity to 
be highly folded. Second, this RNA stability made the recom- 
binational acquisition of additional host gene modules needed 
for the formation of retrotransposons much more likely; the long 
mRNAs typical of retrotransposons and retroviruses were pro- 
tected from destruction by exonucleases. Third, these terminal 
RNA structures provided precise punctuation marks defining the 
retrotransposon termini and facilitating their reproduction with- 
out the loss of even a single terminal nucleotide. These traits set 
the stage for the evolution of elaborate and precise processes of 
reverse transcription evolved by retrotransposons. 

Why Did Eukaryotes Evolve Caps and Tails? 
A number of theories have been advanced as to the evolution of 
the cap and tail. Extensive work on the molecular biology of 
translation has shown that the 5' cap and 3' tail structures are 
directly required for initiation of translation in eukaryotes. Ad- 
ditionally, both RNA structures are protective against terminal 
degradation of the RNA. In particular, the protective role of the 5' 
cap is revealed by the eukaryotic mRNA degradation pathway; 
this process occurs in three steps, (1) 3' deadenylation, leading to 
(2) decapping, followed by (3) 5'-to-3' exonuclease action (Tucker 
and Parker 2000). Although 3' exonucleases are found in eukary- 
otic cells, they appear to play more specialized roles in mRNA 
stability, such as nonstop decay and the destabilization of spe- 
cific mRNAs (van Hoof and Parker 2002). 

Polyadenylation occurs in all three kingdoms of life, al- 
though it only affects a subset of mRNAs in bacteria, and actually 
stimulates mRNA breakdown in prokaryotes (Steege 2000). Thus, 
certain components of the polyadenylation machinery predated 
the evolution of eukaryotes, and it appears that poly(A) simply 
acquired new functions in eukaryotes. In the literature and in 
discussions with colleagues, I've become aware of three theories 
regarding selective pressures leading to a need for a 5' cap. The 
first theory is that the compartmentalization of transcription 
leads to extensive opportunities for potentially inhibitory RNA 
folding prior to translation— potentially, such mRNA hairballs 
could occlude internal Shine-Dalgarno initiation sequences, 
whereas a terminal cap structure could more readily be recog- 
nized, like the end of a ball of yarn (Hershey and Merrick 2000). 
A second theory is that the complex nature of RNA processing in 
eukaryotes could lead to large numbers of misprocessed RNAs. 
Expression of inappropriately processed RNAs could lead to the 
expression of deleterious dominant negative protein fragments, 
for example. Obviously, there are special pathways such as Non- 
sense-mediated decay (Frischmeyer and Dietz 1999; Gonzalez 
et al. 2001) and Nonstop decay, which deal with some of the RNA 
quality control issues raised by the existence of potentially inac- 
curate splicing machinery. However, a third type of proofreading 
is conferred by the obligatory circularization of mRNA during 
translational initiation; any RNA lacking intact 5' or 3' ends will 
not be translated (R. Green, pers. comm.). 

Finally, Stewart Shuman has proposed that the cap arose to 
protect the RNA from 5' exonuclease action, and that the latter 
activity represented a type of primitive immunity against RNA 
viruses (Shuman 2002). Thus, the Xrn1p 5' exonuclease may 
have arisen in response to genomic RNA invaders, and the cap- 
ping machinery evolved in parallel to protect endogenous cellu- 
lar mRNAs. It is clear that eukaryotic cells evolved a series of 
different immunity mechanisms against invading RNA genomes, 
including the interferon system (Kumar and Carmichael 1998) 
and RNAi (Ketting et al. 1999). Needless to say, if this scenario is 
correct, the primitive immunity conferred by 5' exonuclease was 
quickly evaded by viruses that acquired caps by various nefarious 
means or evolved IRES elements that bypassed the cap require- 
ment (Shuman 2002). Nevertheless, it would appear that the ac- 
quisition of the cap/5' exo strategy paradoxically set the stage for 
the evolution of a collection of internal genome invaders of eu- 
karyotes, and eventually, retroviruses. 

An interesting difference between bacteria and eukaryotes 
that may be related to differential RNA stability is the ability of 
eukaryotes to produce significantly longer proteins, such as the 
long polyproteins encoded by retrotransposons. Interestingly, a 
survey of bacterial genomes (Fig. 5) shows that bacteria, on av- 
erage, encode shorter proteins than eukaryotes. This discrepancy 
becomes particularly acute when the longest ORFs are examined. 
The longest ORF in Escherichia coli K12, a putative invasin at 2383
codons, is less than half the length of the longest Saccharomyces
cerevisiae ORF, the MDN1 gene at 4910 codons, and pales in comparison
to human titin at 27,118 amino acids, encoded by an
astonishingly long 82-kb mRNA (Labeit and Kolmerer 1995). This
limit to ORF size does not represent an absolute expression block 
in bacteria, as some very large ORFs encoding nonribosomal 
polypeptide and polyketide synthases have been discovered in 
various bacterial species. It is possible that the simple lifestyle of 
prokaryotes generally requires shorter proteins than the complex 
lifestyle of eukaryotes. The evolution of a more stable mRNA 
structure in eukaryotes may well have contributed to the evolu- 
tion of much greater potential protein structure complexity in 
general in eukaryotes. 

Figure 5 Bacteria and Archaea encode smaller proteins than eukaryotes. The number of codons
in each ORF for the indicated organisms was sorted in Excel, and a point was plotted for each
protein. It can be seen readily that both the mean length and total length of the eukaryotic
proteins are significantly higher than those of both the eubacterial and archaeal species. Results are
typical (data not shown).
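
The comparison behind Figure 5 can be sketched in a few lines of Python: count codons per ORF, sort the lengths, and summarize them. This is a minimal illustration, not the procedure used for the figure (which the caption says was done in Excel). The helper names `codon_count` and `summarize` are hypothetical, and whether the stop codon is counted is an assumption. The only data used are the three longest-ORF values quoted in the text.

```python
from statistics import mean


def codon_count(orf_nt: str) -> int:
    """Number of complete codons in an ORF nucleotide sequence.

    Assumption: the stop codon, if present in the string, is counted.
    """
    return len(orf_nt) // 3


def summarize(orf_lengths):
    """Sort ORF lengths (in codons) and report count, mean, and longest."""
    ordered = sorted(orf_lengths)
    return {"n": len(ordered), "mean": mean(ordered), "longest": ordered[-1]}


# Longest-ORF values quoted in the text (codons or amino acids).
longest = {
    "E. coli K12": 2383,
    "S. cerevisiae": 4910,
    "human (titin)": 27118,
}

# The text states that the E. coli value is less than half the yeast value.
ratio = longest["E. coli K12"] / longest["S. cerevisiae"]
```

Applied to a full list of ORF lengths per genome, `summarize` would give the mean and maximum that the figure compares across organisms.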



Bacteria Are RNA-Hostile 

Recent work on the degradation of bacterial mRNAs has eluci- 
dated the basic molecular mechanisms, which are quite different 
from the eukaryotic mechanism (Table 1). In summary, eubacte- 
ria like E. coli degrade their RNAs through the combined effects of 
multiple endonucleases and 3' exonucleases; many of the rel- 
evant activities are organized in degradosomes (Steege 2000). 
One of the well-studied endonucleases, RNAse E, nicks unstruc- 
tured RNA regions adjacent to structured regions. Whereas the 
products of such nicking are not necessarily excellent direct sub- 
strates for the 3' exonucleases, addition of a 3' poly(A) tail creates 
an opportunity to initiate the degradation process by the degra- 
dosome; hence, mRNA polyadenylation leads to degradation in 
eubacteria. 

If the cap and tail hypothesis is correct, it makes a number 
of predictions— for example, intact long RNA molecules should 
be difficult to detect in bacteria. It has long been known that it is 
extremely difficult to detect bacterial mRNAs by Northern blot- 
ting, and typical measurements of bacterial RNA half-lives range 
from seconds to minutes— far shorter than the half-lives of their 
eukaryotic counterparts, even when the mean mRNA half-life is 
adjusted for the cell generation time (Fig. 6). Only a single value
for average mRNA half-life is available from an Archaeal species,
Sulfolobus solfataricus, which is among the slower-growing Archaea
(some Archaea have fast doubling times similar to those of
eubacteria), and its RNA half-life value is intermediate between
eubacteria and S. cerevisiae, a eukaryote with a relatively short
mRNA half-life (Bini et al. 2002). Examination of the mRNA deg- 
radation components encoded in eubacterial, Archaeal, and eu- 
karyotic genomes shows that the eubacteria and Archaea share 
most of the same genes. Homologs of RNAse E, RNAse II, and 
polynucleotide phosphorylase are readily found by BLAST 
searching against Archaeal genomes. In contrast, Xrn1p homologs
and capping enzyme homologs are absent from eubacteria
and Archaea, but are common to all eukaryotes (Anantharaman
et al. 2002). Furthermore, like eubacteria, Archaea organize
at least some of their genes in operons, and use Shine-Dalgarno
sequences to guide ribosomes to their initiation sites, at least in
some mRNAs (Shuman 2002), suggesting that translational initiation
mechanisms in Archaea are more similar to those in eubacteria
than in eukaryotes.

Eubacterial Retroelements Have 
Small, Highly Structured RNAs 
With Occluded 3' Ends
A second prediction of the cap and tail 
model is that those retroelements that are 
found in eubacteria and Archaea will exhibit 
genomic features suggestive of protection 
against RNA degradation, such as short 
length, extensive secondary structure, and 
occluded 3' ends. The two major classes of 
eubacterial retroelements display just these 
features. Retrointron RNA genomes are 
much shorter than retrotransposons and ret- 
roviruses, typically extending only 1-2 kb 
long versus 4-8 kb or more for typical retro- 
transposons and 10 kb or more for typical 
retroviruses. They are highly folded, their 5'
end is occluded via a 2'-5' linkage and,
moreover, they are always found in the form
of a highly specific RNP, in which the RT-maturase
protein is tightly bound to the intronic
RNA. Importantly, the 3' terminus of
these molecules consists of a series of Watson-Crick
base pairs at the base of the domain VI stem of the
intron, followed by two or three unpaired bases that can form a
tertiary interaction with an internal segment in the intron (γ/γ'
sequences; Bonen and Vogel 2001). Similarly, the retron genome
consists of a small, highly folded molecule in which the 3' end of 
the RNA component is base paired to the 3' end of the DNA 
component (Yamanaka et al. 2002). 

Retrotransposon RNAs Are Capped and Polyadenylated 
Nearly all retrotransposon RNAs contain caps and poly(A) tails, 
as do retroviral RNAs. The case is quite clear for LTR retrotrans- 
posons and retroviruses; there are many reports of poly(A) at the 
3' ends of LTR retrotransposon RNAs, and further evidence for 
posttranscriptionally added 3' poly(A) tails in LTR retrotrans- 
posons can be found readily in EST databases. Capping is more 
laborious to evaluate, but some studies have been performed; for 
example, Ty1 mRNA was examined directly and found to be
capped (Mules et al. 1998b), as are retroviral RNAs. Because LTR
retrotransposons encode proteins required for their own mobility,
and these must be translated from their mRNAs, it is extremely
likely that all LTR-retrotransposon RNAs are capped.

One of the most important characteristics of non-LTR ret- 
rotransposons is that the vast majority of these elements actually 
encode poly(A) in their DNA. This 3' poly(A) tract defines the
element's 3' end; many studies suggest that the 3' poly(A) tract
defines the site at which reverse transcription (TPRT) initiates
(Moran et al. 1996). These poly(A) tails are peculiar in that they
are apparently synthesized, at least in part, by RNA polymerase
rather than the conventional polyadenylation machinery. However,
it is possible that the 3' poly(A) residues might be added
post-transcriptionally using conventional polyadenylation.
There are a few non-LTR retrotransposons, such as the Drosophila
I factor, which terminate not in poly(A), but in a related sequence,
(TAA)n. Clearly, the 3' end of I factor RNA is not formed
by conventional polyadenylation, but by transcription. Nevertheless,
the number of TAA repeats can increase during retrotransposition,
suggesting that a mechanism other than conventional
polyadenylation leads to the lengthening of the element 3'
end, probably slippage by the I factor RT (Pritchard et al. 1988).
Interestingly, the I factor 3' sequence can be replaced with
poly(A), and the modified elements produce progeny elements
with 3' poly(A) tails (Chambeyron et al. 2002). Intriguingly, a
significant subset of human L1 elements carry a related (TAAA)n
repeat in place of poly(A) (Szak et al. 2002). There are a few
non-LTR retrotransposons, such as the CR1 element, that terminate
in a 3' terminal-repeated sequence unrelated to poly(A)
(Burch et al. 1993). Presumably, these mRNAs have found another
way to be circularized during translation, as they must be
translated. Because this type of element lacking a poly(A)-like
sequence is rare, I would propose that this is some late evolutionary
adaptation. Clearly, the ancestral state of this family of
elements is a 3' poly(A) tail.

Capping, however, has not been directly studied in the non- 
LTR retrotransposons, although the similarity of these elements' 
RNAs to mRNA strongly suggests that they are capped. There is 
evidence that the Drosophila jockey non-LTR retrotransposon is 
transcribed by RNA polymerase II, namely that its mRNA synthesis
is α-amanitin sensitive (Mizrokhi et al. 1988). All known
pol II mRNAs are capped; therefore, non-LTR mRNAs are unlikely
to be exceptions to this rule. Finally, the sequence of the human
L1 element provides presumptive evidence for capping. Previous
in vitro studies have shown that various RTs can readily copy the
G residue comprising the cap, in spite of its unusual 5'-5' triphosphate
linkage to the mRNA (Hirzmann et al. 1993; Volloch
et al. 1995; Mules et al. 1998a). The L1 sequence starts with a run
of a variable number of G residues, which I propose has accumulated
through multiple rounds of cap reverse transcription. In
support of this exotic idea, the majority of extra single nucleotides
accumulated at the 5' junction of experimentally isolated
new full-length L1 insertions are G residues, whereas truncated
L1 elements do not prefer single-G insertions (Symer et al. 2002;
N. Gilbert, S.L. Lutz-Prigge, and J.V. Moran, pers. comm.).

Figure 6 Average mRNA half lives of diverse organisms, adjusted for
generation time. The data are plotted directly from Bini et al. (2002).

A final exception to the general rule that eukaryotic retro- 
transposons are capped and polyadenylated is also instructive 
and supports the model, namely, the case of the Alu element and 
the related SINEs. These unusual elements don't need a cap, because
they are not translated, but rely on retrotransposition proteins
encoded by other non-LTR retrotransposons. Intriguingly,
these elements are polyadenylated through transcription, even
though they are transcribed by RNA polymerase III and, hence,
are extremely unlikely to interact with the polyadenylation apparatus.
However, these pol III transcripts lack a 5' cap. A different
mechanism of protection from 5' exonuclease is adopted by
these elements; as in the case of eubacterial retroelement 3' ends,
Alu and related retroelements, as well as the tRNA-derived retroelements,
are also highly folded and the 5' end of the RNA is
always found in an extensively base-paired structure (e.g., see
Sinnett et al. 1992), which would protect it against Xrn1-like 5'
exonucleases.

Retrotransposon RNA levels are highly variable and tend to 
be tissue specific in metazoans, with high levels reached only in
the germ line in most cases (Chaboissier et al. 1990; Branciforte 
and Martin 1994). Naturally, the abundance of retrotransposon 
RNAs is very strongly correlated with retrotransposition fre- 
quency. Because retrotransposition frequencies are set by some 
complex evolutionary interplay unique to each host/retrotrans- 
poson combination, it is not surprising that there is great vari- 
ability in retrotransposon RNA levels. Nevertheless, there are 
some very dramatic cases of very high retrotransposon RNA levels
that provide strong evidence that the cap and tail structure is
compatible with high levels of retrotransposon transcript stability.
Of note, the Drosophila retrotransposon copia is so named
because of its incredibly copious mRNA (Young and Hogness
1977), and yeast Ty1 mRNA levels are among the most abundant
in the yeast cell (Curcio et al. 1990), with Ty1 mRNA visible as a
discrete band in poly(A)-selected RNA preparations.

In conclusion, the stable and well-punctuated mRNA system 
was probably critical in allowing eukaryotes to evolve an ever 
more complex lifestyle, permitting longer more complex pro- 
teins and increased molecular diversity through alternative splic- 
ing. This same key change probably led to the extensive prolif- 
eration of retroelements, including retroviruses, in the many 
complex guises in which they are found today. 

mine. A single dose of clozapine increases
dopamine release in the primate prefrontal 
cortex, and long-term administration in- 
creases basal extracellular dopamine con- 
centration in the prefrontal cortex (21). 
Although this may not be the only mecha- 
nism by which clozapine elicits its effects on 
PCP-induced cognitive dysfunction, this
activation of the dopamine system of the 
prefrontal cortex may contribute to the 
ability of clozapine to ameliorate the im- 
pairments in our model and, perhaps, in 
schizophrenia. 

Our data show that repeated administra- 
tion of PCP inhibits basal and stimulated 
dopaminergic function in the prefrontal 
cortex of the monkey brain. The deficiency 
of dopamine in the prefrontal cortex that is 
induced by repeated administration of PCP 
is associated with a long-lasting cognitive 
deficit, which is ameliorated by the atypical 
therapeutic drug clozapine. These effects 
are observed long after PCP administration 
is stopped and thus cannot be attributed to 
direct effects of the drug. This primate mod- 
el of dopamine dysfunction in the cortex 
may provide a paradigm for investigating 
the pathophysiology underlying neuropsy- 
chiatric disorders associated with a primary 
cognitive dysfunction in the cortex and a 
dopaminergic deficit in the prefrontal cor- 
tex, as is hypothesized in schizophrenia 
(22). It also may provide a means for eval- 
uating therapeutic agents that are selective- 
ly targeted toward alleviating cortical dopa- 
mine hypofunction. 

Telomerase Catalytic Subunit Homologs from
Fission Yeast and Human

Tom M. Nakamura, Gregg B. Morin, Karen B. Chapman,
Scott L. Weinrich, William H. Andrews, Joachim Lingner,*
Calvin B. Harley, Thomas R. Cech†

Catalytic protein subunits of telomerase from the ciliate Euplotes aediculatus and the 
yeast Saccharomyces cerevisiae contain reverse transcriptase motifs. Here the homologous
genes from the fission yeast Schizosaccharomyces pombe and human are identified.
Disruption of the S. pombe gene resulted in telomere shortening and senescence,
and expression of mRNA from the human gene correlated with telomerase activity in cell 
lines. Sequence comparisons placed the telomerase proteins in the reverse transcriptase 
family but revealed hallmarks that distinguish them from retroviral and retrotransposon 
relatives. Thus, the proposed telomerase catalytic subunits are phylogenetically con- 
served and represent a deep branch in the evolution of reverse transcriptases. 



Telomerase is a ribonucleoprotein enzyme
responsible in most eukaryotes for the complete
replication of chromosome ends, or
telomeres (1). Its RNA subunit provides the
template for addition of short sequence repeats
[typically 6 to 26 nucleotides (nts)] to
the chromosome 3' end (2). In ciliated
protozoa and yeast, telomerase is regulated
and the average telomere length is maintained
(3). In most human somatic cells,
however, telomerase activity cannot be detected
and telomeres shorten with successive
cell divisions (4). Telomerase activity
reappears in immortalized cell lines and in
about 85% of human tumors, which has led
to studies of the usefulness of telomerase for
cancer diagnostics and therapeutics (5, 6).

Telomerase RNA subunits have been
identified and analyzed in ciliates, yeast,
and mammals (2, 7), but the protein subunits
have been elusive. In Tetrahymena,
two telomerase-associated proteins (p80,
p95) have been described (8), and p80 homologs
have been found in humans and
rodents (9); the presence of catalytic active
site residues in these proteins has not been
established. Purification of telomerase from
the ciliate Euplotes aediculatus yielded two
proteins, p123 and p43 (10), that appear
unrelated to p80 and p95 (11). p123 contains
reverse transcriptase (RT) motifs and
is homologous to yeast Est2 (Ever shorter
telomeres) protein (11), which is essential
for telomere maintenance in vivo (12). The
RT motifs of Est2p are essential for telomeric
DNA synthesis in vivo and in vitro
(11, 13), supporting the conclusion that
Est2p and p123 are the catalytic subunits of
telomerase. The question remained whether
there are two telomerases in biology, one
based on p80- and p95-like proteins and
one on p123/Est2p-like proteins.

To determine if Est2p/p123 is conserved
among eukaryotes, we searched for homologs
in the fission yeast S. pombe and
humans. Polymerase chain reaction (PCR)
amplification of S. pombe DNA was carried
out with degenerate-sequence primers designed
from the Euplotes p123 RT motifs B'
and C. Of the four prominent products
generated, the ~120-base pair (bp) band
encoded a peptide sequence homologous to
p123 and Est2p. Using this PCR product as
a probe for colony hybridization, we identified
two overlapping clones from a genomic
library and three from a cDNA library (14).
None of the three cDNA clones was full
length, so we used RT-PCR to obtain the
NH2-terminal sequences (15). This putative
telomerase reverse transcriptase gene,
trt1+, encoded a basic protein with a predicted
molecular mass of 116 kilodaltons
(kD) (Fig. 1A). The sequence similarity to
p123 and Est2p was especially high in the
seven RT motifs (Table 1) and in motif T
(Telomerase-specific) (Fig. 2). Fifteen introns,
ranging from 36 to 71 bp, interrupted
the coding sequence. All had consensus
splice and branch site sequences (16).

If trt1+ encodes the telomerase catalytic
subunit in S. pombe, deletion of the gene
would be expected to result in telomere
shortening and perhaps cellular senescence
as seen with the est2 mutants in S. cerevisiae
(11, 13). To test this, we created two deletion
constructs (Fig. 1A), one removing
motifs B' through E in the RT domain, and
the second deleting 99% of the open reading
frame (ORF). Haploid cells grown from
both types of spores showed progressive
telomere shortening to the point where hybridization
to telomeric repeats became almost
undetectable (Fig. 1B). Senescence
was indicated by (i) reduced ability of the
cells to grow on agar, typically by the fourth
streak-out after germination; (ii) the appearance
of colonies with increasingly
ragged edges (Fig. 1C); and (iii) the increasing
fraction of elongated cells (Fig. 1D).
When individual enlarged cells were separated
on the dissecting microscope, the majority
underwent no further division. The
same trt1- cell population always contained
normal-size cells that continued to divide
but frequently produced nondividing progeny.
The telomerase-negative survivors may
use a recombinational mode of telomere
maintenance as documented in budding
yeast strains with deletions of telomere-replication
genes (12, 17).

A candidate human p123/Est2p/Trt1p
homolog was identified by a BLAST search
of the EST (expressed sequence tag) database
(GenBank AA281296). This EST was
the top-ranked match in sequence searches
with Euplotes p123 (P = 3.3 × 10⁻⁶) and S.
pombe Trt1p (P = 9.7 × 10⁻⁷). The human
EST was not found in searches with yeast





Fig. 1. The gene for the S. pombe telomerase
protein and phenotypes associated with its deletion.
(A) The trt1+ locus, the location of the ~120-bp
PCR product that led to its identification, and the regions replaced by ura4+ or his3+ genes in the
trt1- mutants (K, Kpn I; Xb, Xba I; H, Hind III; Xc, Xca I; Xh, Xho I). (B) Telomere shortening phenotype
of trt1- mutants. A trt1+/trt1- diploid (28) was sporulated and the resulting tetrads were dissected
and germinated on a YES (Yeast Extract medium Supplemented with amino acids) plate (29). Colonies
derived from each spore were grown at 32°C for 3 days, and streaked successively to fresh YES plates
every 3 days. A colony from each round was placed in 6 ml of YES liquid culture at 32°C and grown to
stationary phase, and genomic DNA was prepared. After digestion with Apa I, DNA was subjected to
electrophoresis on a 2.3% agarose gel, stained with ethidium bromide to confirm approximately equal
loading in each lane, transferred to a nylon membrane, and hybridized to a telomeric DNA probe. The
Apa I site is located 30 to 40 bp away from telomeric repeat sequences in chromosomes I and II. (C)
Colony morphology of trt1+ and trt1- cells. Cells plated on MM [Minimal Medium (29) with glutamic acid
substituted for NH4Cl] were grown for 2 days at 32°C prior to photography. (D) Micrographs of trt1+ and
trt1- cells grown as in (C).
Est2p, but subsequent pairwise comparison
of these sequences showed a convincing
match. Sequencing of the rest of the
cDNA clone containing the EST revealed
all eight TRT (Telomerase Reverse Transcriptase)
motifs, but not in a single ORF
(18). We used the sequence information
from this incomplete cDNA clone to isolate
an extended cDNA clone from a library
of 293 cells, an adenovirus E1-transformed
human embryonic kidney cell line
(19). This cDNA clone (pGRN121) had a
182-bp insert relative to the EST clone,
which increased the spacing between motifs
A and B' (18) and put all seven RT
motifs and the telomerase-specific motif T
in a single contiguous ORF (Fig. 2).






RT-PCR amplification of RNA from 293 
cells and from testis each gave two prod- 
ucts differing by 182 bp (20). The larger 
and smaller products from testis RNA 
were sequenced and found to correspond 
exactly to pGRN121 and the EST cDNA, 
respectively. 

The relative abundance of hTRT mRNA
was assessed in six telomerase-negative mortal
cell strains and six telomerase-positive immortal
cell lines (21) (Fig. 3). The steady-state
level of hTRT mRNA was higher in immortal
cell lines with active telomerase (6) than in
any of the telomerase-negative cell strains
tested. Telomerase activity was more strongly
correlated with the abundance of hTRT
mRNA than with that of telomerase RNA
(hTR) (7). In contrast, the abundance of
mRNA for the human p80 homolog TP1 (9)
did not correlate with telomerase activity
(Fig. 3). Thus, while our proposal that hTRT
is the catalytic subunit of human telomerase is
based mainly on protein structural features



Table 1. Amino acid sequence identity between
telomerase reverse transcriptases. Each value is
% identity (% similarity in parentheses) based on
RT motifs 1, 2, and A through E in Fig. 2C.

            hTRT      SpTrt1p   Est2p
Ea p123     26 (49)   28 (45)   24 (46)
Est2p       25 (46)   27 (48)
SpTrt1p     30 (47)
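
The kind of pairwise values reported in Table 1 can be illustrated with a short Python sketch. It is not the authors' calculation: the source does not say how similarity was scored, so the sketch assumes two residues are "similar" when they fall in the same h/p/c class listed in the Fig. 2 legend. The function names and the short test strings are hypothetical.

```python
# Residue classes as listed in the Fig. 2 legend (h, hydrophobic; p, polar; c, charged).
CLASSES = {"h": "ALIVPFWM", "p": "GSTYCNQ", "c": "DEHKR"}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def _ungapped_pairs(a: str, b: str):
    """Aligned residue pairs, skipping columns where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]


def percent_identity(a: str, b: str) -> float:
    """Percent of ungapped aligned columns with identical residues."""
    pairs = _ungapped_pairs(a, b)
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def percent_similarity(a: str, b: str) -> float:
    """Percent of ungapped columns whose residues share a class (assumed scoring)."""
    pairs = _ungapped_pairs(a, b)
    if not pairs:
        return 0.0
    similar = sum(
        x in CLASS_OF and CLASS_OF[x] == CLASS_OF.get(y) for x, y in pairs
    )
    return 100.0 * similar / len(pairs)
```

Under this assumed scoring, identical residues always count as similar, so similarity is never lower than identity, which matches the pattern of the values in the table.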







Fig. 2. Structure and RT sequence motifs of telomerase proteins. (A) Locations
of telomerase-specific motif T and conserved RT motifs 1, 2, and A
through E (24) are indicated by colored boxes. The open rectangle labeled
HIV-1 (Human Immunodeficiency Virus) RT delineates the portion of this
protein shown in (B). pI, isoelectric point. (B) The crystal structure of the p66
subunit of HIV-1 RT (Brookhaven code 1HNV). Color-coding of RT motifs
matches that in (A). The view is from the back of the right hand, which allows
all motifs to be seen. (C) Multiple sequence alignment of telomerase RTs
and members of other RT families (Sc_a1, cytochrome oxidase group II
intron 1-encoded protein from S. cerevisiae mitochondria; Dm_TART, reverse
transcriptase from Drosophila melanogaster TART non-LTR retrotransposable
element). Boldface residues indicate identity of at least three
telomerase sequences in the alignment. Colored residues are highly conserved
in all RTs and shown as space-filled residues in (B). The number of
amino acids between adjacent motifs or to the end of the polypeptide is
indicated. TRT con and RT con, consensus sequences for telomerase RTs
(this work) and non-telomerase RTs (24) (amino acids are designated h,
hydrophobic, A, L, I, V, P, F, W, M; p, polar, G, S, T, Y, C, N, Q; c, charged, D,
E, H, K, R). Red arrowheads show some of the systematic differences between
telomerase proteins and other RTs. Red rectangle below motif
E highlights the primer grip region discussed in the text. Abbreviations for the
amino acids are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H,
His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr;
V, Val; W, Trp; and Y, Tyr. The nucleotide sequences of the S. pombe trt1+
gene and the human TRT cDNA (pGRN121) have been deposited in GenBank
(AF015783 and AF015950, respectively).
(similarity of RT motifs, the T motif, molecular
mass > 100 kD, pI > 10), the correlation
of its mRNA expression level with activity
also supports this conclusion.

Sequence alignment of the four telom- 
erase genes revealed features similar to oth- 
er reverse transcriptases, as well as differ- 
ences that serve as hallmarks of the telom- 
erase subgroup. The new T motif is one 
telomerase-specific feature not found in the
other RTs examined. Another is the dis- 
tance between motifs A and B', which is 




Fig. 3. Expression of hTRT in telomerase-negative
mortal cell strains (lanes 1 to 6) and telomerase-positive
immortal cell lines (lanes 7 to 12). RT-PCR
(27) for hTRT, hTR (human telomerase RNA component),
TP1 (telomerase-associated protein related
to Tetrahymena p80), and GAPDH (to normalize
for equal amounts of RNA template) was
carried out on RNA from: (1) human fetal lung
fibroblasts GFL, (2) human fetal skin fibroblasts
GFS, (3) adult prostate stromal fibroblasts 31YO,
(4) human fetal knee synovial fibroblasts HSF, (5)
neonatal foreskin fibroblasts BJ, (6) human fetal
lung fibroblasts IMR90, (7) melanoma LOX IMVI, (8)
leukemia U251, (9) NCI H23 lung carcinoma, (10)
colon adenocarcinoma SW820, (11) breast tumor
MCF7, and (12) human 293 cells.



longer in the TRTs than in other RTs (Fig.
2A). These amino acids can be accommodated
as an insertion within the "fingers"
region of the structure that resembles a
right hand (22, 23) (Fig. 2B). Within the
motifs, there are a number of substitutions
of amino acids (red arrowheads in Fig. 2C)
that are highly conserved among the other
RTs. For example, in motif C the two aspartic
acid residues (DD) that coordinate
active site metal ions (22) occur in the
context hxDD(F/Y) in the telomerase RTs
compared to (F/Y)xDDh in the other RTs
(24). Another systematic change characteristic
of the telomerase subgroup occurs in
motif E, where WxGxSx appears to be the
consensus among the telomerase proteins,
whereas hLGxxh is characteristic of other
RTs (24). This motif E is called the "primer
grip" (23), and mutations in this region
affect RNA priming but not DNA priming
(25). Because telomerase uses a DNA primer,
the chromosome 3' end, it is not unexpected
that it should differ from other RTs
in this region. Given that the simple
change from Mg2+ to Mn2+ allows HIV RT
to copy a small region of a template in a
repetitive manner (26), it is tempting to
speculate that some of the distinguishing
amino acids in the TRTs may cause telomerase
to catalyze repetitive copying of the
template sequence of its tightly bound
RNA subunit.
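
The motif C and motif E consensus patterns described above translate directly into regular expressions. The following Python sketch shows this under two assumptions: "h" is the hydrophobic class given in the Fig. 2 legend, and "x" means any residue. The pattern and function names are hypothetical, and the strings used to exercise them are illustrative, not sequences taken from the alignment.

```python
import re

# Hydrophobic class "h" as listed in the Fig. 2 legend.
H = "[ALIVPFWM]"

# Motif C: hxDD(F/Y) in telomerase RTs versus (F/Y)xDDh in other RTs.
MOTIF_C_TRT = re.compile(rf"{H}.DD[FY]")
MOTIF_C_RT = re.compile(rf"[FY].DD{H}")

# Motif E ("primer grip"): WxGxSx in telomerase RTs versus hLGxxh in other RTs.
MOTIF_E_TRT = re.compile(r"W.G.S.")
MOTIF_E_RT = re.compile(rf"{H}LG..{H}")


def classify_motif_c(segment: str) -> str:
    """Label a protein segment by which motif C context it contains."""
    trt = bool(MOTIF_C_TRT.search(segment))
    rt = bool(MOTIF_C_RT.search(segment))
    if trt and rt:
        return "ambiguous"  # F is both hydrophobic and an (F/Y) residue
    if trt:
        return "telomerase-like"
    if rt:
        return "other-RT-like"
    return "no match"
```

Because phenylalanine satisfies both "h" and "(F/Y)", a segment such as FVDDF fits both contexts, so a real classifier would need the surrounding alignment rather than the five-residue window alone.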

Using the seven RT domains (Fig. 2C) 
defined by Xiong and Eickbush (24), we 
constructed a phylogenetic tree that in- 
cludes the four telomerase RTs (Fig. 4). The 
TRTs appear to be more closely related to 
RTs associated with msDNA (multicopy 
single-stranded DNA), group II introns, 
and non-LTR (Long Terminal Repeat) ret- 
rotransposons than to the LTR-retrotrans- 
poson and viral RTs. The relationship of 
the telomerase RTs to the non-LTR branch 
of retroelements is intriguing, given that 
the latter elements have replaced telomerase
for telomere maintenance in Drosophila



Fig. 4. A possible phylogenetic tree 
of telomerases and retroelements 
rooted with RNA-dependent RNA 
polymerases. After sequence alignment
of motifs 1, 2, and A through E
(178 positions, Fig. 2C) from four 
TRTs, 67 RTs, and three RNA poly- 
merases, the tree was constructed 
using the Neighbor Joining method 
(30). Elements from the same class 
that are located on the same 
branch of the tree are simplified as a 
box. The length of each box corre- 
sponds to the most divergent ele- 
ment within that box. 



(27). However, the most striking finding is 
that the TRTs form a discrete subgroup, 
about as closely related to the RNA-depen- 
dent RNA polymerases of plus-stranded 
RNA viruses such as poliovirus as to retro- 
viral RTs. In view of the fact that the four 
telomerase genes are from evolutionarily 
distant organisms — protozoan, fungi, and 
mammals — this separate grouping cannot 
be explained by lack of phylogenetic diver- 
sity in the data set. Instead, this deep 
branching suggests that the telomerase RTs 
are an ancient group, perhaps originating 
with the first eukaryote. 

The primary sequence of hTRT and 
eventual reconstitution of active human telo- 
merase may be used to discover telomerase 
inhibitors, which in turn will permit addi- 
tional testing of the anti-tumor effects of 
telomerase inhibition. The correlation be- 
tween hTRT mRNA levels and human 
telomerase activity shown here indicates 
that hTRT also has promise for cancer di- 
agnosis. With an essential protein compo- 
nent of telomerase now in hand, the stage is 
set for more detailed investigation of fun- 
damental and applied aspects of this ribo- 
nucleoprotein enzyme. 

The nexus of chemokine immunobiology
and AIDS pathogenesis has revealed untapped
avenues for resolving patterns of
HIV-1 disease progression, for clarifying
epidemiologic heterogeneity, and for design
of therapies (1-6). Identification of the
CC-chemokines, RANTES, MIP-1α and
MIP-1β, as suppressor factors produced by
CD8 cells that counter infection by certain
HIV-1 strains (7) previewed the
critical identification of two chemokine receptor
molecules, CXCR4 (formerly named
LESTR/fusin) and CCR5 (formerly CKR5),
as cell surface coreceptors with CD4 for
HIV-1 infection (8-13). Additional chemokine
receptors CCR2 and CCR3 also
have been implicated as HIV-1 coreceptors
on certain cell types (12-14). HIV-1-infected
patients harbor predominantly macrophage-tropic
HIV-1 isolates during early
stages of infection, but accumulate increasing
amounts of T cell-tropic strains just
before accelerated T cell depletion and progression
to AIDS. The identification of
"dual"-tropic HIV-1 strains over the course
of infection suggests that such strains may
represent an intermediate between macrophage-
and T cell-tropic populations (11-13,
15). This tropic transition indicates that
viral adaptation from CCR5 to CXCR4
receptor use may be a key step in progression
to AIDS (16).



Contrasting Genetic Influence of CCR2 and 
CCR5 Variants on HIV-1 Infection and 
Disease Progression 

Michael W. Smith,* Michael Dean,* Mary Carrington,*
Cheryl Winkler, Gavin A. Huttley, Deborah A. Lomb,
James J. Goedert, Thomas R. O'Brien, Lisa P. Jacobson,
Richard Kaslow, Susan Buchbinder, Eric Vittinghoff,
David Vlahov, Keith Hoots, Margaret W. Hilgartner,
Hemophilia Growth and Development Study (HGDS),
Multicenter AIDS Cohort Study (MACS), Multicenter Hemophilia
Cohort Study (MHCS), San Francisco City Cohort (SFCC),
ALIVE Study, Stephen J. O'Brien†

The critical role of chemokine receptors (CCR5 and CXCR4) in human immunodeficiency 
virus-type 1 (HIV-1) infection and pathogenesis prompted a search for polymorphisms 
in other chemokine receptor genes that mediate HIV-1 disease progression. A mutation 
(CCR2-64I) within the first transmembrane region of the CCR2 chemokine and HIV-1
receptor gene is described that occurred at an allele frequency of 10 to 15 percent among
Caucasians and African Americans. Genetic association analysis of five acquired im-
munodeficiency syndrome (AIDS) cohorts (3003 patients) revealed that although
CCR2-64I exerts no influence on the incidence of HIV-1 infection, HIV-1-infected individuals
carrying the CCR2-64I allele progressed to AIDS 2 to 4 years later than individuals
homozygous for the common allele. Because CCR2-64I occurs invariably on a
CCR5-+-bearing chromosomal haplotype, the independent effects of CCR5-Δ32 (which also
delays AIDS onset) and CCR2-64I were determined. An estimated 38 to 45 percent of
AIDS patients whose disease progresses rapidly (less than 3 years until onset of AIDS
symptoms after HIV-1 exposure) can be attributed to their CCR2-+/+ or CCR5-+/+
genotype, whereas the survival of 28 to 29 percent of long-term survivors, who avoid
AIDS for 16 years or more, can be explained by a mutant genotype for CCR2 or CCR5.

Phylogenetic Relationships of Reverse Transcriptase and
RNase H Sequences and Aspects of Genome Structure in 
the Gypsy Group of Retrotransposons 1 

Mark S. Springer* and Roy J. Britten†

*Department of Biology, University of California; and †Division of Biology, California
Institute of Technology



The gypsy group of long-terminal-repeat retrotransposons contains elements having 
the same order of enzyme domains in the pol gene as do retroviruses. Elements in 
the gypsy group are now known from yeast, filamentous fungi, plants, insects, and 
echinoids. Reverse transcriptase and RNase H amino acid sequences from elements 
in the gypsy group, including the recently described SURL elements, TED, Cft1,
and Ulysses, were aligned and analyzed by using parsimony and bootstrapping
methods, with plant caulimoviruses and/or retroviruses as outgroups. Clades sup-
ported at the 95% level after bootstrapping include (1) 17.6 with 297 and (2) all
of the SURL elements together. Other likely relationships supported at lower
bootstrap confidence intervals include (1) SURL elements with mag, (2) 17.6 and 297
with TED, and this collective group with 412 and gypsy, (3) Tf1 with Cft1, (4)
IFG7 with Del, and (5) all of the retrotransposons in the gypsy group together, to
the exclusion of Ty3. In contrast with an earlier analysis, our results place mag
within the gypsy group rather than outside of a cluster that contains gypsy group 
retrotransposons and plant caulimoviruses. Several features of retrotransposon ge- 
nomes provide further support for some of the aforementioned relationships. The 
union of SURL elements with mag is supported by the presence of two RNA 
binding sites in the nucleocapsid protein. Location of the tRNA primer binding 
site and the presence of a long open reading frame 3' to the pol gene support the 
17.6-297-TED-412-gypsy cluster. 

Introduction 

Retrotransposons containing long terminal repeats (LTRs) have now been iden- 
tified in the genomes of a number of organisms and can be divided into two groups 
on the basis of both phylogenetic analysis of amino acid sequences and structural 
features of the genome (Xiong and Eickbush 1988, 1990; Doolittle et al. 1989). In 
the copia group, with representatives from Drosophila (copia and 1731), yeast (Ty1),
plants (Tnt1, Ta1-3, Tst1, Wis, and Bis), and Physarum (Tp1), the integrase gene is
located between the protease and reverse transcriptase genes. In the gypsy group, with
representatives from insects (gypsy, 412, 17.6, 297, mag, micropia, and Ulysses), yeast
(Ty3 and Tf1), filamentous fungi (Cft1), echinoids (SURL elements), and plants
(IFG7 and Del), the integrase gene is located 3' to the RNase H gene. The gypsy
group of LTR retrotransposons is related to plant caulimoviruses and to retroviruses, 
on the basis of reverse transcriptase sequences (Xiong and Eickbush 1990). 

1. Key words: retrotransposon, reverse transcriptase, RNase H.

Here, we examine phylogenetic relationships among members of the gypsy group
by using amino acid sequences from the reverse transcriptase and RNase H proteins.
Previous phylogenetic analyses of the gypsy group include Doolittle et al. (1989) and
Xiong and Eickbush (1990). Xiong and Eickbush (1990) included 10 elements from
the gypsy group in their analysis of reverse transcriptase sequences. Since that time,
sequences for SURL elements, TED, Tf1, Cft1, and Ulysses have become available.
We also evaluate the distribution and evolution of structural features in these retro- 
transposons in the light of amino acid-based phylogenies. Several structural features 
corroborate phylogenetic analysis on the basis of amino acid sequences. 

Methods 

Amino acid sequences and features of retrotransposons were obtained from 
GenBank and from references given in figure 1. Sequences of representative plant 
caulimoviruses were also obtained from GenBank. Delineation of boundaries for the
reverse transcriptase protein corresponds to that used by Xiong and Eickbush (1990).
Delineation of RNase H sequence boundaries roughly corresponds to the region iden-
tified by McClure (1991). Multiple alignments were made by using CLUSTAL (Higgins
and Sharp 1988), and adjustments were made by eye when conserved residues defined
in Xiong and Eickbush (1990) and McClure (1991) were not aligned. Maximum
parsimony and bootstrapping were performed by using PAUP, version 3.0s (Swofford
1991), with gaps counted as missing data. Plant caulimoviruses and/or retroviruses
were used as outgroups. Each step on a parsimony tree corresponds to a single amino
acid replacement. Because exact methods of finding minimum-length trees could not
be used for the complete set of sequences, a heuristic approach using 100 replications
with random input orders was employed. We also used a starting tree consistent with
the tree given in Xiong and Eickbush (1990) as a baseline for searching for shorter
trees. A distance matrix based on the aligned amino acid sequences was constructed
by using the Kimura (1983, p. 175) option of the PROTDIST program in PHYLIP
and was analyzed by using the neighbor-joining method (Saitou and Nei 1987).
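
The bootstrap step named above can be illustrated with a short Python sketch. It shows only the column resampling that defines a bootstrap replicate; the parsimony search that PAUP runs on each replicate is not reproduced. The function name `bootstrap_alignment`, the dictionary layout, and the toy sequences are assumptions made for illustration and are not taken from the paper or from PAUP.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment (hypothetical helper).

    alignment: dict mapping element name -> aligned sequence (equal lengths).
    Columns are drawn with replacement, so the replicate has the same
    length as the original, but some columns repeat and others drop out.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy alignment with made-up residues (not data from the paper).
toy = {"A": "DFTKKF", "B": "DFEKKF", "C": "DCKKLT"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
```

In a full analysis each replicate would be passed to a tree search, and the percentage of replicates that recover a given clade would be reported as its bootstrap support.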

Results 
Alignments 

Figure 1 shows a multiple alignment of amino acid sequences from the reverse 
transcriptase region. Overall, this alignment is similar to that of Xiong and Eickbush
(1990), and most of the conserved blocks in their alignment are retained in the present
alignment. Figure 2 shows an alignment of sequences from the RNase H region. 

Phylogenetic Trees 

Two minimum-length trees containing 2,219 amino acid replacements were found
for the combined reverse transcriptase/RNase H sequences. One of these trees, rooted
by using plant caulimoviruses, is shown in figure 3. On the second tree (not shown)
the Tf1-Cft1 and IFG7-Del groups switch positions, and micropia is closer to
Ulysses than to the SURL-mag group. Also shown on the tree in figure 3 are the
consensus results of 500 bootstrap replications. Results summarized in figure 3 show
(1) a likely sister-group relationship (86%) of TED (from the cabbage looper Tricho-
plusia ni) with 17.6 plus 297 (from Drosophila), (2) a likely sister-group relationship
(73%) between the plant retrotransposons IFG7 and Del, (3) a likely sister-group 
relationship (80%) between SURL elements and mag, (4) a likely sister-group
relationship (81%) between Tf1 (from fission yeast) and Cft1 (from the fungal tomato
pathogen Cladosporium fulvum), and (5) a possible clade (62%) containing 17.6, 297,
TED, 412, and gypsy. In addition, Ty3 is an outgroup to all other retrotransposons
in the gypsy group on 70% of the bootstrap trees. Ulysses and micropia group with
SURL elements and mag on both minimum-length trees, but this association does
not hold up after bootstrapping. Likewise, the minimum-length tree shown in figure
3 supports a clade containing all of the retrotransposons that occur in metazoans, but
this branch does not occur on the second minimum-length tree, nor is it supported
by bootstrapping. In contrast to the tree in figure 3, the shortest tree consistent with
that of Xiong and Eickbush (1990) is 30 steps longer, at 2,249 steps.

[Fig. 2 alignment rows (17.6, 297, TED, 412, gypsy (Dm), gypsy (Dv), SURL (Sp), SURL (Tg), SURL (Lv), mag, micropia, Ulysses, IFG7, Del, Tf1, Cft1, Ty3, and four plant caulimoviruses) are illegible in this copy.]

FIG. 2. Alignment of amino acid sequences from the RNase H region for the gypsy group of retro-
transposons and several plant caulimoviruses. Abbreviations are given in fig. 1. An asterisk (*) denotes a
stop codon.

When we converted our sequence alignments to distances by using the Kimura 
option of PROTDIST (PHYLIP, version 3.5; Felsenstein 1993) and then employed 
the neighbor-joining method, the resulting tree (not shown) showed some differences
from the minimum-length trees, but all of the branches that are supported at the 50% 
level in figure 3 are also supported on the neighbor-joining tree. 
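
The Kimura option referred to above converts the observed fraction p of differing amino acid sites into an estimated distance d = -ln(1 - p - 0.2p^2). A minimal Python sketch follows. Skipping columns that contain a gap, and the name `kimura_protein_distance`, are assumptions made for illustration; the sketch is not a description of how PROTDIST treats every case.

```python
import math


def kimura_protein_distance(seq_a, seq_b, gap="-"):
    """Kimura (1983) distance between two aligned protein sequences.

    Columns in which either sequence has a gap are skipped (an assumption
    of this sketch). Raises ValueError when the formula is undefined.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = differing / compared
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("sequences too divergent for the Kimura formula")
    return -math.log(inner)


# Toy example with made-up residues: half of the four sites differ.
d = kimura_protein_distance("AAAA", "AABB")
```

A matrix of such pairwise distances is the input that a neighbor-joining routine clusters into a tree.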

Minimum-length trees (not shown) based on reverse transcriptase versus RNase 
H sequences exhibit several conflicts; for example, SURL elements cluster with mag 
on reverse transcriptase trees but cluster with the two gypsy elements on RNase H 
trees. However, all of the conflicts involve branches that are not supported after
bootstrapping. Bootstrapping the reverse transcriptase and RNase H sequences, respectively,
provides support for the following: 17.6 plus 297 (94% and 97%), and for this group
with TED (85% and 64%); the two gypsy elements together (100% and 100%); the
three SURL elements together (100% and 100%) with SURL (Sp) and SURL (Tg)
as nearest neighbors (96% and 90%); and Tf1 plus Cft1 (60% and 73%). In addition,
bootstrapping reverse transcriptase sequences provides support for SURL elements
with mag (60%), IFG7 plus Del (71%), and all of the retrotransposons together,
except Ty3 (52%).

Fig. 3. One of two minimum-length trees at 2,219 replacements. Numbers above the line are the
number of amino acid replacements, and numbers below the line are the percentages, from 500 bootstrap
trials, that support the clade. One asterisk (*) denotes a value of 99%, and two asterisks (**) denote the
branch of this tree that the root would be on if the reverse transcriptases of the seven retroviruses were used
for rooting.

Features of gypsy-like Elements 

Table 1 summarizes the distribution of seven different features of retrotransposons 
in the gypsy group. The phylogenetic significance of these features is discussed below. 

Discussion 

Xiong and Eickbush (1988, 1990) previously examined relationships among re- 
troid elements, including retrotransposons in the gypsy group, on the basis of reverse 
transcriptase sequences. One of the differences on the Xiong and Eickbush (1990) 
tree is that mag is outside of a cluster containing other retrotransposons in the gypsy 
group as well as plant caulimoviruses. To test this hypothesis with our data, it was 
necessary to include retroviruses as an outgroup to the collective group. We limited 
this analysis to reverse transcriptase sequences because of the difficulty in aligning 
RNase H sequences. Retroviruses clearly root the tree (not shown) such that the plant-
caulimovirus and retrotransposon groups (including mag) are each monophyletic.

Two other differences on the Xiong and Eickbush (1990) tree are as follows: (1)
Ty3 is not peripheral to other gypsy retrotransposons but occupies a position close to
IFG7 and Del, and (2) 412 is the most peripheral member of the gypsy cluster, except
mag. Whether we (1) use parsimony or neighbor-joining methods, (2) include RNase
H and reverse transcriptase or just reverse transcriptase sequences, or (3) restrict our 
analysis to the reverse transcriptase sequences available to Xiong and Eickbush (1990), 
Ty3 occupies the most peripheral position among retrotransposons in the gypsy group, 
and 412 clusters with the insect elements gypsy, 297, 17.6, and TED. 

The overall congruence between reverse transcriptase and RNase H bootstrap 
trees indicates that a similar phylogenetic signal is present in both, although, when 
taken separately, each of these proteins provides less resolution than they do in com- 
bination with each other. One of the implications of the overall congruence between 
bootstrap trees is that reverse transcriptase and RNase H have similar evolutionary 
histories without any interelement recombination that might cause striking differences. 

If Ty3 is taken as an outgroup to all of the other retrotransposons, then the 
implied primitive character states for the characters in table 1 are +1 ribosomal frame-
shifting, one RNA binding site, tRNA methionine, a +2 location of the tRNA primer
binding site (PBS), and lack of a long open reading frame (ORF) 3' to the pol gene. 
On the basis of these designations of primitive character states, several of the aspects 
of genome structure given in table 1 offer additional support for some of the branches 
on the tree in figure 3. First, 17.6, 297, and TED are united by the putative shared 
derived character of tRNA serine, although a putative tRNA serine also occurs in 
Cft1 (McHale et al. 1992). Second, 17.6, 297, TED, 412, and gypsy share a number
of putative derived characters, including a long ORF 3' to the pol gene, a 1-bp overlap 
of the 5' LTR and the tRNA PBS, and -1 frameshifting of the pol gene relative to 
the gag gene, as well as the absence of RNA binding sites in the nucleocapsid protein. 
While two of these derived characters have evolved elsewhere on the tree (i.e., -1
frameshifting also occurs in Cft1 and mag, and RNA binding sites are absent in
Ulysses), the presence of a long ORF 3' to pol and a -1 location of the tRNA PBS
are unique to this subset of the gypsy group. Third, the putative relationship between
mag, SURL elements, and possibly micropia is potentially strengthened by the exclusive
occurrence of two RNA binding sites in the nucleocapsid protein in all of these ele-
ments. Most retroviruses also possess two RNA binding sites, but in the somewhat
more closely related plant caulimoviruses there is only a single site. Further support 
for the alliance between mag and SURL elements comes from the observation that 
the number of amino acids separating the two RNA binding sites is identical in these 
elements. Micropia, in turn, has 14 additional amino acids that separate the first and 
second RNA binding sites. The plant elements Del and IFG7 share a number of 
features, such as a single RNA binding site, a single ORF containing the gag and pol 
genes, and a tRNA methionine PBS, but these features appear primitive on the basis 
of their occurrence in Ty3. 

The long LTRs in Ulysses and Del appear homoplastic on the basis of other 
evidence discussed above, whereas the short LTRs in mag are unique to this element. 
Element length ranges from 4,564 bp in mag to 10,653 bp in Ulysses and reflects the 
differences in LTR length. Among other elements, most of the variation results not 
from differences in LTR length but rather from the additional ORF 3' to the pol gene. 

It is interesting that, for the tree in figure 3, all of the animal retrotransposons 
occur on one branch, whereas the two plant elements occur on a second branch. The 
separate clusters of plant and animal retrotransposons suggest that the host phylogeny 
imposes a distinct signature on the phylogeny of the retrotransposons; Flavell (1992) 
previously noted predominantly plant and animal groups for the copia group of re-
trotransposons as well. Flavell (1992) has also characterized the copia group as lacking
ribosomal frameshifting, whereas in the gypsy group the gag and pol genes are always 
overlapping. However, the presence or absence of overlapping gag and pol genes is 
shown here to exhibit more variation in the gypsy group than was previously recognized. 

In conclusion, our understanding of the phylogeny of the gypsy group of retro- 
transposons is enhanced by considering not only amino acid sequences but also genetic 
features of these elements. Some features (e.g., long 3' ORF) show little or no hom-
oplasy, whereas others (e.g., type of tRNA PBS) are labile and show much more
homoplasy.

Reverse Transcriptase Motifs
in the Catalytic Subunit
of Telomerase

Joachim Lingner,* Timothy R. Hughes, Andrej Shevchenko,
Matthias Mann,† Victoria Lundblad,† Thomas R. Cech†

Telomerase is a ribonucleoprotein enzyme essential for the replication of chromosome 
termini in most eukaryotes. Telomerase RNA components have been identified from 
many organisms, but no protein component has been demonstrated to catalyze telo- 
meric DNA extension. Telomerase was purified from Euplotes aediculatus, a ciliated 
protozoan, and one of its proteins was partially sequenced by nanoelectrospray tandem 
mass spectrometry. Cloning and sequence analysis of the corresponding gene revealed 
that this 123-kilodalton protein (p123) contains reverse transcriptase motifs. A yeast 
(Saccharomyces cerevisiae) homolog was found and subsequently identified as EST2
(ever shorter telomeres), deletion of which had independently been shown to produce
telomere defects. Introduction of single amino acid substitutions within the reverse
transcriptase motifs of Est2 protein led to telomere shortening and senescence in yeast,
indicating that these motifs are important for catalysis of telomere elongation in vivo. In
vitro telomeric DNA extension occurred with extracts from wild-type yeast but not from
est2 mutants or mutants deficient in telomerase RNA. Thus, the reverse transcriptase
protein fold, previously known to be involved in retroviral replication and retrotranspo- 
sition, is essential for normal chromosome telomere replication in diverse eukaryotes. 



Replication of chromosome ends, or telo- 
meres, requires specialized factors that are 
not essential for replication of internal 
chromosome sequences. Conventional 
DNA polymerases cannot fully replicate
blunt-ended DNA molecules (1) or eukary-
otic chromosomes (2), which contain 3'-
terminal extensions. The key to end repli-
cation is telomerase, a ribonucleoprotein
(RNP) enzyme that synthesizes the telo-
meric DNA repeats (3). The template for
telomeric repeat synthesis is provided by 
the RNA subunit, which has been identi- 
fied, cloned, and sequenced in ciliated pro- 
tozoa (4, 5), yeast (6, 7), and mammals (8). 

A telomerase RNP was first purified 
from Tetrahymena (9). Two protein compo- 
nents, p80 and p95, were specifically asso- 
ciated with the RNA subunit. Human, 
mouse, and rat homologs of Tetrahymena 
p80 have since been identified and found to 
be associated with telomerase (10). Al-
though this evolutionary conservation sug-
gests that p80 and p95 have important roles
in telomere replication, their specific func-
tions remain unclear. Neither protein has
been reported to be essential for telomere
synthesis, and neither has significant simi-
larity to known polymerases or reverse tran-
scriptases (11).

Telomerase RNP has also been purified 
from Euplotes aediculatus, a hypotrichous cil- 
iate only distantly related to Tetrahymena 
(12). The hypotrichs present a special op- 
portunity for telomere studies because their 
macronuclei contain millions of gene-sized 
DNA molecules. Each cell has about 8 ×
10^7 telomeres (13) and about 3 × 10^5 mol-
ecules of telomerase (12). Measurements of
the specific activity of telomerase through-
out the purification indicated that the ma-
jor activity present in macronuclear ex-
tracts was purified (12). The active telo-
merase complex had a molecular mass of
~230 kD, corresponding to a 66-kD RNA
subunit and two proteins of 123 kD and
~43 kD (12). Photocross-linking experi-
ments implicated the larger protein in spe-
cific binding of the telomeric DNA sub-
strate (14).

Here we characterize the p123 compo-
nent of Euplotes telomerase and show that it
contains sequence hallmarks of reverse tran-
scriptases. Furthermore, it is the homolog of
a yeast protein, Est2p, shown previously to
function in telomere maintenance. Our ge-
netic and biochemical analyses show that
the reverse transcriptase motifs of Est2p are
essential for telomeric DNA synthesis in
vivo and in vitro. We propose that telo-
merase, frequently called "a specialized re-
verse transcriptase," is in fact a reverse tran-
scriptase in terms of its catalytic active site.

Fig. 1. Sequencing of the p123 subunit of telomerase by nanoelectrospray tandem mass spectrometry.
(A) Mass spectrum of the unseparated peptide mixture. All peptides that were sequenced completely or
partially are marked by the letter T or t, respectively (15). The eight peptide ions from which sequence
tags were generated are marked by filled circles. Most unlabeled peaks correspond to trypsin autolysis
products. (B) Tandem mass spectrum of the doubly charged precursor at the mass-to-charge ratio
(m/z) of 830.4 in (A). Interpretation of the fragment ion mass in (B) and comparison with the esterified
form of the peptide allowed the sequence assignment.

Determination of Euplotes p123 se-
quence. The genes encoding the telomerase
protein subunits from E. aediculatus were
isolated by reverse genetics. Telomerase was
purified and polypeptides were separated on
SDS-polyacrylamide gels. Amino acid se-
quencing of the trypsin-digested p123 band
was accomplished by nanoelectrospray tan- 
dem mass spectrometry (15-17), a minia- 
turized form of electrospray (18) that allows 
mass spectrometric interrogation of minute 
analyte volumes for extended periods of 
time due to its low flow rate. No chroma- 
tography is needed, because the unfraction- 
ated peptide mixture obtained after diges- 
tion of the protein in a gel slice is separated 
and sequenced in the spectrometer. For 
p123, 14 peptides were sequenced de novo
(Fig. 1) (15). 

Two of the peptide sequences were used 
to design degenerate polymerase chain re- 
action (PCR) primers (arrows in Fig. 2) to 
amplify a portion of the macronuclear 
gene encoding p123. A genomic library
was prepared from macronuclear DNA
and screened with this fragment to isolate
the full-length gene (19). The p123 gene
was found to be encoded by a 3279-base 
pair (bp) macronuclear chromosome con- 
taining an uninterrupted 1031-amino acid 
open reading frame. In a Southern (DNA) 
blot experiment the PCR fragment hybrid-
ized to a single macronuclear chromosome
of ~3.3 kb (20). The open reading frame
predicts a protein of 122,562 daltons, cor- 
responding to the size estimated by SDS- 
polyacrylamide gel electrophoresis of puri- 
fied protein [120 kD (12)]. More than 150 
amino acids identified in the purified 
polypeptide by mass spectrometry could be 
assigned in the open reading frame (Fig. 
2). This includes all 14 peptides that were 
completely sequenced. The tandem mass 
spectra of 10 additional peptides also
matched the gene sequence through par- 
tial sequences or peptide sequence tags 
(21). 
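
The sizes quoted in this paragraph can be checked against one another with simple arithmetic. The Python sketch below assumes that the 1031-residue open reading frame is followed by a single stop codon; the variable names are illustrative only.

```python
# Values reported in the text.
CHROMOSOME_BP = 3279        # macronuclear chromosome length
ORF_RESIDUES = 1031         # amino acids in the open reading frame
PREDICTED_MASS_DA = 122_562  # predicted protein mass in daltons

# Three nucleotides per codon; one stop codon is an assumption of this sketch.
coding_bp = ORF_RESIDUES * 3
coding_with_stop_bp = coding_bp + 3
noncoding_bp = CHROMOSOME_BP - coding_with_stop_bp

# Average mass per residue implied by the predicted protein mass.
mean_residue_mass_da = PREDICTED_MASS_DA / ORF_RESIDUES

print(coding_with_stop_bp, noncoding_bp, round(mean_residue_mass_da, 2))
```

Under that assumption the open reading frame accounts for 3096 of the 3279 bp, leaving 183 bp of non-coding sequence, and the predicted mass corresponds to an average of about 119 daltons per residue.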

Fig. 3. Block diagrams of
p123 and Est2p and com-
parison of the reverse tran-
scriptase (RT) domains with
those of other reverse tran-
scriptases. The spacing of
sequence motifs (red) is di-
agnostic for each reverse
transcriptase family (27). In
the consensus sequence,

Fig. 2. Sequence alignment of Euplotes (Est) p123 and yeast [S. cerevisiae (Sc)]
Est2p (50). Identical amino acids are noted in boldface. The PCR primers used
to amplify a portion of the gene are indicated by the arrows. Assigned reverse
transcriptase motifs [designated by letters (26) or alternatively by numbers in
parentheses (27)] are shown in orange, with the most highly conserved amino
acids in red. In the consensus sequences of the motifs, h designates a hydro-
phobic amino acid, p a polar amino acid, and + a positively charged amino
acid. The underlined sequences in p123 are the 14 peptides completely se-
quenced by nanoelectrospray tandem mass spectrometry. The dashed lines
below the p123 sequence indicate another 10 peptides whose tandem mass
spectra matched the sequence. One of the peptides contained an acetylated
methionine (solid triangle) at its NH2-terminus, indicating that it was the NH2-
terminal peptide of the protein. The nucleotide sequence of the Euplotes p123
gene has been deposited in GenBank (accession number U95964).

Reverse transcriptase motifs in Eu-
plotes p123 and its yeast homolog Est2p.
In a BLAST search of protein databases,
Euplotes p123 was found to be most similar
to Saccharomyces cerevisiae Est2p (P = 7 ×
10^-7) and to a group II intron-encoded
reverse transcriptase from the cyanobacte-
rium Calothrix (P = 2 × 10^-4) (22, 23).
Yeast Est2p has a predicted molecular
mass of 103 kD and, like p123, is very
basic (Fig. 3). Although the overall se-
quence identity of Euplotes p123 and yeast
Est2p is only 20% (Fig. 2), sequence sim-
ilarity (correspondence of acidic, basic,
hydrophobic, and hydrophilic amino ac-
ids) can be detected over the entire length
of the two proteins.
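
The 20% figure is a percent identity over aligned positions. As a rough illustration of how such a value is computed from a pairwise alignment, the Python sketch below counts identical residues among columns in which neither sequence has a gap. The choice of denominator, the function name `percent_identity`, and the toy sequences are assumptions, since the text does not state how the figure was calculated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns with identical residues.

    Hypothetical helper: the denominator (columns where both sequences
    have a residue) is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    aligned_cols = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        aligned_cols += 1
        if a == b:
            identical += 1
    if aligned_cols == 0:
        return 0.0
    return 100.0 * identical / aligned_cols


# Toy alignment with made-up residues: 4 identical of 5 gap-free columns.
print(percent_identity("ACDE-F", "ACQEGF"))  # prints 80.0
```

Similarity scores of the kind mentioned above would additionally count substitutions within the same residue class, which this sketch does not attempt.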

The EST2 (ever shorter telomeres) gene 
was one of four complementation groups 
identified by screening yeast mutants for 
reduction in telomere length and senes- 
cence (24, 25). Epistasis analysis had indi- 
cated that the four EST genes function in 



the same pathway as TLCJ, the gene en- 
coding the telomcrasc RNA subunit (6), 
suggesting that the EST genes encode ei- 
ther components of the telomerase or pos- 
itive regulators of its activity. The homol- 
ogy of yeast Est2p with Euplotes pi 23, the 
latter isolated because of its physical asso- 




Senescence phenotype: 



Fig. 4. In vivo functional analysis of 
motifs in Est2p. (A) The 12 

within the reverse transcriptase domain of Est2p are indicated by 
downward arrows (red, telomerase-conserved residues; black,
nonconserved residues). The phenotypic effects of the mutations
are indicated by solid triangles (strong mutant phenotype)
and open triangles (weak mutant phenotype). The sequence
alignment includes members of three other reverse transcriptase
families (27). Boldface residues indicate identity of at least two
sequences in the alignment. See (50) for amino acid abbreviations.
(B) Senescence phenotype of est2 mutants shown by
spreading single colonies on plates (51). Photographs were taken
after ~75 generations of growth. (C) Telomere length of est2
mutants. Southern blot of genomic yeast DNA, hybridized
with a telomere-specific probe (24). Single-copy plasmids
carrying the wild-type EST2 gene (lanes 2 and 15), the indicated
est2 mutant genes (lanes 3 to 14), or empty vector
(lane 1) were transformed into an est2-Δ strain. Genomic
DNA was prepared after ~75 generations of growth, at a
time of maximal senescence for an est2 null strain. The
bracket and four small arrows indicate telomeric bands, and
the two larger arrows indicate the subtelomeric repeat fragments
that are amplified late in the growth of est2 mutant
strains (24, 28). Five independent transformants of each missense
mutant were assayed, one of which is shown. (D)
Dominant-negative effect, resulting from overexpression of
certain Est2p mutants. A Southern blot of genomic yeast
DNA, prepared from a wild-type EST2+ strain transformed
with a high-copy plasmid expressing wild-type or the indicated
mutant est2 genes, was developed as in (C). In each case,
the EST2 promoter was replaced with the constitutive promoter
of the alcohol dehydrogenase (ADH) gene. Lanes 1
and 16, empty vector. Two transformants are shown for
each mutation after ~50 generations of growth. Additional
growth resulted in further telomere shortening, although this
additional length reduction is not sufficient to confer a senescence
phenotype (52).
ciation with telomerase RNA and its co-
purification with telomerase activity, sup- 
ported the proposal that both proteins 
are intrinsic subunits of their respective 
telomerases.

Euplotes p123 contains reverse transcriptase
motifs, and the alignment reveals the
presence of these motifs in a similar region
of Est2p (Fig. 3). The primary sequences of
reverse transcriptases are highly divergent:
Only a few amino acids are absolutely conserved
within separate short motifs (26,
27), but these motifs are believed to form a
common tertiary fold. Both p123 and Est2p
contain these key conserved amino acids,
most notably the three invariant aspartates
in motifs A and C, which are thought to be
directly involved in catalysis (Fig. 2). Conserved
motifs are spaced differently in the
two major branches of reverse transcriptases,
those encoded by retroviruses and
long terminal repeat (LTR) retroposons
and those encoded by non-LTR retroposons
and group II introns (27). The
spacing of sequence motifs in p123 and
Est2p resembles that in the latter branch.
However, the interval between motifs A
and B' in p123 and Est2p is unusually
large (Fig. 3), suggesting that these two
polypeptides may be members of a previously
unknown subcategory.

Requirement of the reverse transcriptase
motifs for Est2p function in vivo. The
presence of reverse transcriptase motifs in
both p123 and Est2p suggests that this region
may define the catalytic active site of telomerase.
To test the importance of these motifs
for Est2p function, we used site-directed
mutagenesis to change conserved and nonconserved
aspartic acid (D) and glutamine
(Q) residues in and around motifs A, B', and
C to alanine (A) (Fig. 4A). Each mutant,
present on a single-copy ARS CEN plasmid,
was tested for in vivo function in a complementation
assay. Plasmids were transformed
into the est2-Δ strain (Δ designates deletion),
in parallel with either the empty vector
or an EST2+ plasmid. Transformants
were assessed for the senescence phenotype
(Fig. 4B) and for chromosome telomere
length (Fig. 4C).

Consistent with the prediction that the 
reverse transcriptase motifs are required for 
Est2p function, mutation of any of the three 
conserved aspartates in motifs A and C 
prevented normal telomerase activity. 
Transformants expressing these mutant pro- 
teins became senescent and had shortened 
telomeric tracts, phenotypes indistinguish- 
able from those of the null mutant (Fig. 4, B 
and C). Furthermore, a bypass pathway for 
telomere maintenance (28) was evident in 
these three mutant strains. Activation of 
this alternative pathway occurs as the result 
of a global amplification and rearrangement 
of both telomeric G-rich repeats and subtelomeric
regions, and has only been observed
in est and tlc1 mutant strains with a severe
telomere shortening phenotype (24, 29). A
feature of this pathway is the amplification 



Fig. 5. Sedimentation of 
telomerase. Yeast ex- 
tract was fractionated on 
a glycerol gradient (32), 
and telomerase RNA was 
detected by Northern 
blotting (bottom) and its 
concentration quantified 
on a PhosphorImager
(top). Detection of U1
snRNP served as an internal
control. Fractions
pooled for activity assays
are indicated. Telomerase
RNP sedimented as
a 19S to 20S particle,
whereas deproteinized telomerase RNA
sedimented at ~17S. The
sedimentation value was
determined relative to
marker proteins that
were run in parallel gradients
and that had sedimentation
coefficients of
7.6S (alcohol dehydrogenase),
11.3S (catalase),
17.3S (apoferritin), and
19.3S (thyroglobulin).







of two subtelomeric bands (Fig. 4C); these 
diagnostic restriction fragments were sub- 
stantially amplified only in the est2 null 
mutant and the three proposed active site mutants.

Mutations of amino acids other than the 
three most conserved aspartates had less 
severe or no phenotypic effects. The residue 
Asp536 of motif A is conserved between
Est2p and p123, and the D536A mutation
(Asp mutated to Ala at position 536)
caused substantial telomere shortening and 
a modest senescence phenotype. Of the 
conserved residues tested, Gln W2 of motif 
B' was the only one that was functionally 
insensitive to replacement with alanine. 
However, this glutamine is not strictly con- 
served in reverse transcriptases (27), and 
when it is changed to alanine in human
immunodeficiency virus-1 (HIV-1) reverse
transcriptase, polymerase activity in vitro is
reduced but not completely eliminated
(30). In contrast to the phenotypes seen 
upon mutation of the semiconserved amino 
acids, mutation of six of the seven noncon- 
served amino acids tested showed little or 
no alteration of Est2p function. 

Two observations indicate that stable
Est2 protein was produced in the five est
mutants with a diminished capacity to complement
the est2-Δ strain. First, Myc3-epitope-tagged
versions of each mutant protein
were visualized immunologically after
immunoprecipitation (31). Second, overexpression
of each of the five mutant alleles in
a wild-type yeast strain with a functional
chromosomal EST2+ gene resulted in telomere
shortening (Fig. 4D), whereas overexpression
of the wild-type EST2 gene had
little effect. The dominant-negative phenotype
shows that each mutant protein is
being made and suggests that excess mutant
Est2p can titrate components away from the
wild-type telomerase complex.

Requirement of Est2p for telomerase
activity in vitro. If Est2p is the catalytic
protein subunit of telomerase, then telomerase
activity should be abolished in est2
mutant extracts. An in vitro assay was developed
with extracts fractionated by glycerol
gradient centrifugation (32). Telomerase-containing
fractions were identified by
detection of the RNA subunit on Northern 
blots (Fig. 5). Yeast telomerase sedimented 
as a 19S to 20S particle, substantially faster 
than the sedimentation of the deprotein- 
ized telomerase RNA (~17S). Telomerase- 
containing glycerol gradient fractions were 
pooled, concentrated, and tested for the 
ability to elongate a single-stranded telo- 
meric oligonucleotide (Fig. 6). An activity 
was detected in wild-type extracts that had 
the characteristics of telomerase. It was de- 
pendent on the presence of oligonucleotide 
substrate and fractionated extract (Fig. 6A, 
lanes 1 to 3). Addition of T and G residues
occurred in an ordered manner consistent
with the expected alignment of substrate
and RNA template (Fig. 6A, lanes 5 and 6).



The 
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is of ribonuclease (RNe 
not stimulated by adenosine triphosphate 
(ATP) (Fig. 6B). These characteristics, in 
addition to the observed single round of 
extension of primer (Fig. 6A), are similar to 
those of the telomerase activity described by 
Blackburn and co-workers (33). A different
activity described as telomerase by Lue and 
Wang (34) gives rise to long products and is 
stimulated by ATP. This latter activity was 
not detectable in our telomerase-containing 
glycerol gradient fractions. 

A telomerase RNA template mutation 
that alters the specificity of nucleotide in- 
corporation to produce a Hae III restriction 
site (6) provides an additional test for the 
authenticity of the in vitro telomerase assay.
An extract of this TLC1-1(HaeIII) mutant,
fractionated on a glycerol gradient,
gave the predicted extension of the telomeric
oligonucleotide only in the presence
of deoxycytidine triphosphate (dCTP) (Fig.
6C, lanes 6 to 8), a nucleotide that has no
effect on extension by a wild-type extract
(Fig. 6C, lanes 2 and 3). This nucleotide
specificity change supports the dependence
of the assay on the TLC1 RNA. Because
the TLC1-1(HaeIII) strain also undergoes
senescence (29), this result also provides 
confidence that telomerase activity can still 
be detected in senescing cells, as long as 
they are not subcultured too extensively. 

We then assayed fractionated extracts
from est2-Δ and tlc1-Δ strains for telomerase
activity (Fig. 6D). As expected, no activity
was detectable in tlc1-Δ yeast, which has
the gene for telomerase RNA deleted. In
the est2-Δ strain, telomerase RNA was still
assembled into an RNP, as assessed by glycerol
gradient centrifugation and Northern
blotting (32), but telomerase activity was
completely absent. This indicates that
Est2p is essential for telomerase activity. As
described above, the absence of activity is
not simply a secondary consequence of senescence.
We also measured telomerase activity
in extracts from est2-Δ and strains
expressing two of the proposed active site
mutants in the presence of the chain-terminating
analog ddGTP (Fig. 6E). According
to the proposed primer-template alignment,
extension should terminate after addition of
two nucleotides. A practical advantage is
the higher signal-to-noise ratio obtained
when all products are concentrated in one
or two bands. Again, activity was dependent
on functional TLC1 and EST2 genes.

Telomerase structure. The presence of a 
reverse transcriptase domain in the catalytic 
subunit of telomerase provides a framework 



for exploring the structure and mechanism of 
this enzyme. Reverse transcriptases have 
been studied in great detail, and the three-dimensional
structure of HIV-1 reverse transcriptase
has been solved (35). The structure
can be compared with a right hand with 
fingers, palm, and thumb, with the active 
site residing in the palm (36). A model for 
telomerase structure based on that of HIV-1 
reverse transcriptase (HIV-1 RT) is shown 
in Fig. 7 with the telomerase RNA and a 
telomeric DNA substrate superimposed. 

The catalytic subunit of telomerase has 
several features that distinguish it from oth- 
er reverse transcriptases. Telomerase uses 
only a small portion of its RNA subunit as 
a template. The borders of this template 



must somehow be recognized. Furthermore, 
during processive synthesis of telomeric re- 
peats the substrate translocates from one 
end of the template to the other by an as yet 
unknown mechanism. The large gap between
motifs A and B' of telomerase p123
and Est2p indicates an unusual finger domain
structure. In HIV-1 RT this domain
may be involved in template strand binding 
(35, 36); whether and how it contributes to 
the unusual reaction mechanism of the te- 
lomerase RNP remain to be investigated. 
Finally, the telomerase protein is stably as- 
sociated with its RNA subunit, as shown by 
our isolation of the Euplotes p123-RNA
complex and by coimmunoprecipitation of 
the yeast RNA subunit with Est2p (31). 
Fig. 6. In vitro functional analysis of reverse transcriptase motifs in
Est2p. Telomerase was partially purified by glycerol gradient
centrifugation and assayed for the ability to extend a telomeric DNA
substrate (32). In the assay [α-32P]dTTP was included to visualize
products elongated by 1, 2, 3, or 4 nucleotides (+1, +2, +3, or +4).
(A) The telomerase RNA template region maximally base-paired to the
DNA substrate is indicated schematically. Product lengths were
determined relative to the same DNA substrate extended by one
nucleotide at its 3' end by reaction with [α-32P]ddTTP and terminal
deoxynucleotidyl transferase (lane 4). Up to seven nucleotides were
added in the presence of dGTP and dTTP (lane 3), one nucleotide in the
presence of only dTTP (lane 5), and two nucleotides in the presence of
dTTP and the chain-terminating analog ddGTP (lane 6). Oligo,
oligonucleotide. (B) Effect of RNase A and ATP on telomerase activity.
Standard reaction (lane 1), standard reaction plus 1 mM ATP and 1 mM
additional MgCl2 (lane 2), and standard reaction plus RNase A at 0.1
ng/µl (lane 3), 1 ng/µl (lane 4), and 10 ng/µl (lane 5). (C) Specificity of
nucleotide incorporation dictated by the RNA template sequence.
Product lengths were determined relative to DNA markers that had
been extended by [α-32P]ddTTP (lane 1) or [α-32P]ddCTP (lanes S
and 9) as in (A). Note that these two markers had slightly different
mobilities on the polyacrylamide gel. The mutant TLC1-1(HaeIII)
telomerase RNA template is indicated with the substrate bound in
the most stable register. Consistent with this alignment and the
mutated template sequence, efficient extension required the presence
of dCTP (lanes 6 and 7). Telomerase in extract from TLC1-wt
cells (WT) [see (A) for template sequence] was not influenced by the
presence of dCTP (lanes 2 and 3). (D) Requirement of functional
EST2 and TLC1 products for telomerase activity. Fractionated extracts
from wild-type (lanes 1 and 2) and the indicated mutant strains
(lanes 3 to 6) (32) were tested at two extract concentrations. Reactions
1, 3, and 5 contained 10% (v/v) of telomerase fraction, and
reactions 2, 4, and 6 contained 20%. (E) Alleviation of telomerase
activity by active site mutations in Est2p. All assays included the
chain terminator ddGTP (100 µM). The reactions contained 10%
(v/v) of telomerase fraction (lanes 1 to 6) or 5% of each of the
indicated fractions (lanes 7 to 9). The results of the mixing experiment
(lanes 7 to 9) indicate that the absence of activity is not due to
an inhibitor in the mutant extracts.
This last feature distinguishes telomerase
from the retroviral and LTR retroposon re- 
verse transcriptases, but is similar to some 
mitochondrial and group II intron-encoded 
reverse transcriptases that also form com- 
plexes with their RNA templates (37). 

Reverse transcriptase essential for 
chromosome replication in diverse eu- 
karyotes. Reverse transcriptases have not 
previously been considered essential for 
normal cell physiology. Initially discovered 
as retroviral enzymes that catalyze the de- 
fining RNA-to-DNA step of retroviral rep- 
lication (38), they were later found to me- 
diate the transposition of DNA elements 
within eukaryotic genomes through an 
RNA intermediate (39). Reverse transcrip- 
tases are also present in some prokaryotes 
(40) and in Neurospora mitochondria (41),
where they replicate genetic elements that 
are nonessential to their "host." Our discov- 
ery that a structurally related enzyme is 
essential for chromosome replication and 
cell division provides another example of 
the opportunism of nature: once a useful 
protein motif is stumbled upon, natural se- 
lection promotes its exploitation in diverse 

The evolutionary relationship between 
telomerase and the other reverse transcrip- 
tases is intriguing. It is well established that 
retroviruses acquired oncogenes such as v-src,
v-abl, v-ras, and v-fos from cellular genomes.
According to Temin's protovirus
hypothesis, retroviruses also acquired their 
reverse transcriptase gene from normal 
cells, where the enzyme presumably con- 
tributed to some normal cellular process 
(42). Could this cellular source have been 
the telomerase p123/EST2 gene, which mutated
so that the protein product used an



Fig. 7. Model of telomer- 
ase as an RNA-reverse 
transcriptase complex. 
The p123/Est2p subunit 
(green) is based on the 
right hand model of 
HIV-1 RT (36); thumb 
and fingers extend to- 
ward the reader. Motifs 
A, B', C, and D are in the 



3' end of the telomeric 
DNA substrate (red). The 
RNA subunit (purple) has 
its template region in the 
palm; the location of the 
remainder of the RNA is 
unknown and is shown 
schematically in its sec- 
ondary structure repre- 
sentation (5). Additional 



exogenous rather than an intrinsic RNA 
template? Alternatively, telomerase and the 
reverse transcriptases encoded by retro- 
transposons and retroviruses may all be de- 
scendants of an ancestral protein that 
emerged from an "RNA world" (43). 

Telomere replication in the fruit fly Drosophila
has been mysterious because this organism
does not have short repeated telomeric
sequences and presumably no telomerase.
Rather, the non-LTR retroposons
HeT-A and TART cap the chromosome
ends (44). The TART reverse transcriptase
is closely related to p123 and Est2p, which
suggests that the Drosophila telomere replication
machinery may in fact not be so
different from that of other eukaryotes (45).

We have no satisfactory explanation for
the lack of correspondence between the
Euplotes and yeast p123/Est2p proteins and
the Tetrahymena p80 or p95 protein (9).
The small protein subunit of Euplotes telomerase
(p43) also shows no similarity to
the Tetrahymena proteins (46), and the
complete yeast genome sequence does not
reveal obvious p80 and p95 homologs.
There are three possible explanations: (i)
Tetrahymena may have a different telomerase
in which p80 and p95 provide the active

than once in evolution), (ii) Tetrahymena 
may have two telomerases, one containing 
p80 and p95 and one (unisolated) containing
a p123/Est2p homolog (for example,
one telomerase for de novo telomere formation
during macronuclear development and
one for telomere replication), (iii) The Tetrahymena
p80-p95-RNA complex may not
be an active enzyme but may require a
p123/Est2p subunit that was underrepresented
upon purification of the particle.



Mass spectrometric methods have re- 
cently become very successful for the iden- 
tification of proteins whose genes are al- 
ready partially or completely contained in 
sequence databases (47). The sequencing of 
more than 150 amino acids of the p123
telomerase subunit at protein amounts too 
low for chemical methods shows that mass 
spectrometry is now also valuable for se- 
quencing previously unidentified proteins. 

Telomerase activation accompanies the 
immortalization of cultured mammalian 
cells and is also a common property of 
human tumor cells (48). Thus, telomerase is 
considered to be a potential target for the 
development of tumor-specific drugs. Cer- 
tain reverse transcriptase inhibitors devel- 
oped as anti-HIV drugs have already been 
tested against telomerase with some success 
(49). The finding that the telomerase ac- 
tive site is related to that of known reverse 
transcriptases is expected to stimulate such 
efforts. 

AND NOTES 




protein subunits may be associated (not shown). The telomeric DNA substrate is shown base-paired but 
not intertwined with the RNA subunit. The extent of base pairing and the sites of interaction of the nucleic 
acids with the protein are not known. 
Functional analysis of HIV-1 reverse transcriptase motif C:
site-directed mutagenesis and metal cation interaction.

Valverde-Garduno V, Gariglio P, Gutierrez L.

Departamento de Virus y Cancer, Instituto Nacional de Salud Publica,
Cuernavaca, Morelos, Mexico. veronica@ibt.unam.mx 

Motif C, present in all polymerases, has been proposed to be part of the catalytic 
and metal binding site of the enzyme, suggesting that polymerases have a 
common origin. Previously, we have shown that the metal ion manganese induces 
alterations in nucleotide substrate specificity in some polymerases. However, it is 
not known if the active site responsible for incorporation of nonspecific substrates 
is the same as that which incorporates specific ones. Here we show that 
manganese enables HIV-1 reverse transcriptase (RT) to incorporate rNTP's using 
RNA as a template, thus behaving as an RNA replicase. Also, we show that the 
mutation D186H in motif C strongly affects the natural DNA polymerase activity
and that the RNA replicase activity becomes undetectable, suggesting that both
activities depend on the same active site. This mutation changes the metal ion
preference, with mutant RT presenting only 0.5% of the wild-type DNA
polymerase activity in the presence of magnesium but 1.6% of the same activity
in the presence of manganese. This variation in cation preference suggests that
residue D186 is part of the metal binding site. Since residue D186 of motif C is
essential for both activities and appears to be involved in the binding of an 
important cation needed for the specific activity, our results support the idea of a 
common origin for all polymerases, from an ancestral unspecified polymerase 
containing at least motif C. 



EXHIBIT 7 



AIDS RESEARCH AND HUMAN RETROVIRUSES
Volume 16, Number 8, 2000, pp. 721-729 
Mary Ann Liebert, Inc. 



Reverse Transcriptase Sequences with a

ABSTRACT

We have developed a highly sensitive, universal assay that allows detection as well as identification of all 
known retroviral reverse transcriptase (RT)-related nucleic acids in a biological sample by a single two-step 
experiment. The assay combines polymerase chain reaction (PCR) and reverse dot-blot hybridization (RDBH), 
using an array of immobilized synthetic retrovirus-specific oligonucleotides and two sets of mixed oligo primers 
(MOPs). These primers were derived from highly conserved motifs found in all known reverse transcriptase 
genes. The PCR/RDBH assay was used for qualitative analyses of human endogenous retrovirus (HERV) tran- 
scription in peripheral blood mononuclear cells (PBMCs) and in particles released by the human mammary 
carcinoma-derived cell line T47D. Sensitivity was further demonstrated by detection of down to 10 copies of 
pig endogenous retrovirus (PERV) DNA in human cDNA samples. Therefore, this assay is particularly
useful for the identification of retroviral sequences in xenografts as well as in recipients of xenografted tissues
and organs. Moreover, it is a valuable tool to detect retroviral transcripts and particles in cell cultures used 
for production of therapeutic polypeptides. The assay is further suitable for monitoring vector preparation 
used in human gene therapy to exclude transfer of copackaged endogenous retroviruses into target cells. 



INTRODUCTION 

The genomes of all vertebrates contain a wide spectrum 
of endogenous retroviruses (ERVs) and reverse transcrip- 
tase (RT)-related sequences. For example, human ERVs 
(HERVs) are estimated to comprise at least 1-2% of the hu- 
man genome. 1,2 Although most of these sequences are assumed 
to be defective, some retain certain biological activities and thus 
represent a reservoir of retroviral genes with pathogenic po- 
tential. Characterization of particles released by the human 
breast cancer-derived cell line T47D revealed that comple- 
mentation between several expressed HERVs can lead to 
pseudotype particles packaging retroviral RNA of different origin. 3,4
Thus, activation and expression of (H)ERV may result
in undesired mobilization of genetic material of retroviral origin
and may interfere with the safe production of therapeutic
polypeptides, with safe human gene therapy and xenotransplantation. 5

Cross-packaging of ERVs to a high level is observed in 
murine packaging cells commonly used for retroviral vector 
preparation. 6 Cross-packaged ERV transcripts may be trans- 
mitted to recipient cells leading to unwanted integration events, 
or may recombine with the vector forming new infectious retro- 
viruses. This is of high concern for the safety of retrovirus-me- 
diated human gene therapy. The risk of acquiring animal ERVs 
through xenotransplantation also requires attention since 
xenografts from baboons and pigs are currently discussed for 
human use. In vitro experiments indicate that xenotropic ERVs 
such as murine or cat retroviruses can propagate considerably 
in human cells. 7 Baboon endogenous retrovirus (BaEV) readily
infects human cells in culture. The same holds true for
porcine ERV and several human cell types. 5,8-10 Although these
viruses do not show a pathogenic effect in their natural hosts,
the situation may change when they are transferred to immunosuppressed
humans, in whom the virus might replicate to
high titers. 11,12

With respect to prospective practical applications we have 
established a universal detection assay that allows rapid testing 
of biological samples for undesired mobilization of retroviral 
sequences. With this method all known reverse transcriptase- 
related sequences of human and animal origin can be simulta- 
neously identified in a single two-step experiment. 



MATERIALS AND METHODS 

RNA preparation 

Total RNA was extracted from peripheral blood mononuclear
cells (PBMCs) of healthy blood donors according to a
guanidinium isothiocyanate-cesium chloride (GIT/CsCl) ultracentrifugation
protocol 13 and dissolved in diethylpyrocarbonate
(DEPC)-treated distilled water. Subsequently, mRNA was purified
with Dynabeads paramagnetic particles as described by
the manufacturer (Dynal, Hamburg, Germany). Nucleic acid
concentrations were calculated by spectrophotometry at 260
nm. To check for genomic DNA contamination, 50 ng of each
mRNA preparation was tested in a polymerase chain reaction
(PCR) with mixed oligonucleotide (oligo) primers (MOPs)
omitting the reverse transcription step. Only preparations negative
for DNA traces were used for PCR. Samples positive for
DNA contamination were treated with RNase-free DNase (100
units/µg; Roche Diagnostics, Mannheim, Germany) in 100 mM
sodium acetate (pH 5.0), 5 mM MgSO4 until control PCR was
negative.

Primers and reverse dot-blot oligonucleotides 

For PCR two different mixed oligonucleotide primer (MOP) 
sets, MOP-1 and MOP-2, have been designed. The primer se- 
quences correspond to highly conserved regions present in the 
reverse transcriptase (RT) genes of all known human endoge- 
nous and exogenous retroviruses, as well as related animal retro- 
viruses (Fig. 1). I4 " 1S MOP-1 primers preferentially amplify hu- 
man and mammalian type A, B, and D reverse transcriptase 
sequences, whereas MOP-2 primers allow the amplification of 
human and mammalian type C-related RT sequences as well as 
RT sequences of human exogenous retroviruses such as HIV, 




MOP-1 forward gaaggattcaragtnytdychcmrggh 

reverse gaaggatcc ktwddmkdtyatcmrvkwa 

MOP-2 forward gmggatcctkkammskvytrcyhcarggg 

reverse gaaggatccm dvhdrbmdkymayvyahkka 



FIG. 1. (A) Localization of conserved amino acid domains in the amino-terminal coding region of reverse transcriptases of retroviruses.
The core homology regions VLPQG and YM/VDDI/V/LL were used to design the mixed oligonucleotide primer sets
MOP-1 and MOP-2. (B) Primer set MOP-1 was optimized for amplification of type A, B, and D retroviruses, whereas primer set 
MOP-2 was selected for favored priming of retrovirus type C-related templates as well as human exogenous retroviruses such as 
HIV, HTLV, HRV5, and foamy retroviruses. Standard single-letter abbreviations (IUPAC) are used to describe degenerate nu- 
cleotides. Both forward and reverse primers are oriented in the 5'-to-3' direction with respect to the DNA strand to be amplified. 
HTLV, HRV5, and foamy viruses (Fig. 1). For each primer set
a separate PCR was performed. The amplification products
were then mixed in equimolar amounts and used as probe in
the reverse dot-blot hybridization.
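
The degenerate MOP primers are written in IUPAC single-letter code, so each primer stands for a pool of concrete sequences. As a minimal sketch (not part of the published method), the following counts and enumerates the sequences encoded by a degenerate primer. The function names are hypothetical, and the example primer string is copied from Fig. 1 as printed here, so it may carry transcription errors.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.lower() if not base.isspace())


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.lower() if not base.isspace()]
    return ("".join(combo) for combo in product(*pools))


if __name__ == "__main__":
    # MOP-1 forward primer as printed in Fig. 1 (assumed to be transcribed correctly).
    mop1_forward = "gaaggattcaragtnytdychcmrggh"
    print(len(mop1_forward), "nt, degeneracy", degeneracy(mop1_forward))
    print(sorted(expand("gry")))  # small illustrative pool
```

The degeneracy grows multiplicatively with each ambiguous position, which is one reason the amount of primer per reaction is stated per primer set rather than per individual sequence.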

To design the oligonucleotides bound on the reverse dot-blot 
filters, databases were screened for RT-related sequences. RT 
sequences of exogenous and endogenous retroviruses were clas- 
sified according to the current nomenclature and further sub- 
grouped with respect to their degree of nucleotide homology 
(data not shown). Representative members of all retrovirus families
published so far were selected (Table 1) and the sequence
information corresponding to the 90-bp stretch between the
highly conserved RT motifs LPQG and YM/VDDI/V/LL 16
was used for synthesis of a pair of oligonucleotides, each 45
nucleotides in length. Thus, each dot consists of an equimolar
mixture of two 45-mer oligonucleotides covering the internal
sequence of the MOP amplicon.
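
The probe design described above can be sketched as follows, assuming that the two 45-mers simply tile the 90-nucleotide internal region end to end (the text does not state this explicitly). The function name and the placeholder sequence are hypothetical and serve only as illustration.

```python
def split_into_probe_pair(internal_region: str, oligo_length: int = 45):
    """Split an internal RT region into two adjacent oligonucleotides.

    Assumes the region is exactly twice the oligo length (90 nt for 45-mers).
    """
    if len(internal_region) != 2 * oligo_length:
        raise ValueError(
            f"expected {2 * oligo_length} nt, got {len(internal_region)} nt"
        )
    return internal_region[:oligo_length], internal_region[oligo_length:]


if __name__ == "__main__":
    # Placeholder 90-nt sequence, not a real retroviral RT sequence.
    placeholder = ("acgt" * 23)[:90]
    first, second = split_into_probe_pair(placeholder)
    print(len(first), len(second))
```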

Reverse transcription and MOP PCR 

Five hundred nanograms of DNA-free mRNA preparations
was reverse transcribed in a volume of 50 µl containing 20 mM
Tris-HCl (pH 8.4), 10 mM dithiothreitol (DTT), 50 mM KCl,
2.5 mM MgCl2, deoxynucleoside triphosphates (dNTPs; 0.5 mM
each), 10 units of RNasin (Promega, Madison, WI), 30 pmol
of random hexamer oligonucleotides (Promega), and 20 units
of murine leukemia virus (MuLV) reverse transcriptase
(GIBCO-BRL, Gaithersburg, MD) at 37°C for 1 hr. Subsequently,
reverse-transcribed samples were denatured for 5 min at
95°C and stored at -20°C.

MOP amplification was carried out in a total volume of 50
µl containing 1/20 of the cDNA reactions, 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 2.5 mM MgCl2, 0.001% gelatin, 50 pmol
of each mixed oligonucleotide primer set, a 0.25 mM concentration
of each deoxynucleoside triphosphate, and 1.25 units of
Taq polymerase (GIBCO-BRL). Amplification was performed
in a DNA thermal cycler (Perkin-Elmer Cetus, Emeryville, CA).
Cycle parameters were as follows: 30 cycles of 94°C for 30 sec,
50°C for 4 min, and 72°C for 1 min, followed by a final extension
step of 7 min at 72°C. We chose 50°C for the annealing
step since it corresponds to the annealing temperature
of the degenerate primers with the highest A-T ratio. A control
reaction in which the template was omitted was carried out
to detect product carryover and any traces of contaminating genomic
DNA in the solutions used. Amplification products were
analyzed on preparative 2.5% Tris-borate-EDTA (TBE)
agarose gels and stained with ethidium bromide. Bands of interest
with sizes between 100 and 150 bp corresponding to amplified
retroviral reverse transcriptase sequences were excised
from the gel and purified with a Gene Clean II kit (Bio 101,
Vista, CA). For reverse dot-blot hybridization about 50 ng of
the purified fragments was labeled with [α-32P]dATP (3000
Ci/mmol), using a Megaprime DNA labeling kit (Amersham
Pharmacia Biotech, Little Chalfont, England).
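
As an illustrative check of the cycling program given above, the following sketch adds up the programmed hold times exactly as printed (30 cycles of 30 sec, 4 min, and 1 min, plus a 7-min final extension). All names are hypothetical, and ramp times between temperatures are ignored, so an actual run would take longer.

```python
# Hold times per cycle in seconds, as stated in the text.
CYCLE_STEPS_SEC = {
    "denaturation_94C": 30,
    "annealing_50C": 4 * 60,
    "extension_72C": 1 * 60,
}
CYCLES = 30
FINAL_EXTENSION_SEC = 7 * 60


def total_hold_minutes(cycles: int, steps_sec: dict, final_sec: int) -> float:
    """Total programmed hold time in minutes, excluding temperature ramps."""
    return (cycles * sum(steps_sec.values()) + final_sec) / 60


if __name__ == "__main__":
    minutes = total_hold_minutes(CYCLES, CYCLE_STEPS_SEC, FINAL_EXTENSION_SEC)
    print(f"programmed hold time: {minutes:.0f} min")
```

Most of the programmed time is spent in the long annealing step, which is where the degenerate primers pair with their imperfectly matched templates.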

Preparation of filter arrays 

Retrovirus-specific synthetic oligonucleotides corresponding
to the 90-bp internal part of the amplified RT sequence were
synthesized and high-performance liquid chromatography
(HPLC) purified by Birsner & Grob-Biotech GmbH (Freiburg,
Germany). For each retroviral sequence 100 pmol of a pair of
45-mer oligonucleotides mixed in equimolar amounts was
diluted in 5× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium
citrate) and spotted onto ZETAprobe GT blotting membranes
(Bio-Rad, Hercules, CA), using a Minifold I dot blotter
SRC96D (Schleicher & Schuell, Dassel, Germany). Filters were
rinsed in 2× SSC and oligonucleotides were irreversibly
immobilized by UV cross-linking (Stratalinker; Stratagene, La
Jolla, CA). Filters were allowed to air dry.
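The SSC strengths used for spotting (5×) and rinsing (2×) follow directly from the 1× definition given in the text. The sketch below works out the molarities and a simple C1V1 = C2V2 dilution; the 20× stock in the usage line is an assumption for illustration, as the text does not say which stock solution was used.

```python
# 1x SSC as defined in the text: 0.15 M NaCl plus 0.015 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Molar concentration of each salt in fold-x SSC."""
    return {salt: conc * fold for salt, conc in SSC_1X_MOLAR.items()}


def stock_volume_ml(stock_fold, target_fold, final_volume_ml):
    """C1 * V1 = C2 * V2 solved for the stock volume V1."""
    if target_fold > stock_fold:
        raise ValueError("target strength exceeds stock strength")
    return target_fold * final_volume_ml / stock_fold


print(ssc_molarity(5))
# Hypothetical 20x stock diluted to 100 ml of 5x SSC:
print(stock_volume_ml(20, 5, 100), "ml stock, made up to 100 ml with water")
```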

Hybridization procedures

Standardized hybridization conditions were as follows:
Prehybridization of reverse dot-blot filters was performed within
sealed plastic bags in 0.25 M Na2HPO4 (pH 7.2), 1% sodium
dodecyl sulfate (SDS), 1 mM EDTA at 50°C for at least 3 hr.
For hybridization 5 × 10^5 cpm of labeled probe per milliliter
of hybridization volume was added to the same solution and
incubated for 16 hr under the same conditions. The membranes
were then washed twice (30 min each) at 50°C in 40 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA and twice in 40 mM
Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA, respectively. Filter
membranes were exposed to X-ray film (BioMax; Eastman
Kodak, Rochester, NY).



RESULTS 

Design of mixed oligo primers 

The pol genes of all retroviruses and most retroelements
share highly conserved core homology regions. 15-17 Two of the
most conserved amino acid regions are the VLPQG and YV/M 
DDI/V/LL motifs (Fig. 1). The spacing between both motifs is 
about 90 base pairs and this region shows considerably less ho- 
mology when compared between different retrovirus families. 
According to a general principle outlined by Shih et al. 15 we
derived from these motifs universal PCR primers that allow
amplification of all known retroviral RT-related templates. After
comparison of RT core homology regions of all human and 
mammalian endogenous and exogenous retroviral sequences 
available in the database, two sets of degenerate pol primers 
were designed. Primer set MOP-1 was optimized for amplifi- 
cation of type A, B, and D retroviruses, whereas primer set 
MOP-2 was selected for favored priming of retrovirus type C- 
related templates as well as human exogenous retrovirus such 
as HIV, HTLV, HRV5 and foamy retroviruses. A 9-base ex- 
tension featuring a clamp and a ScraHl restriction site was in- 
corporated at the 5' end of each primer. Since the sequence ex- 
tension has a stabilizing effect on primer-template binding 
kinetics, the products generated after the first PCR cycle are 
amplified more efficiently in the remaining cycles. Therefore, 
the amplification reaction can be considered as "multiplex" 
PCR under moderate primer-template annealing conditions. 
Retroviral templates in the reaction mixture can be amplified 
sufficiently with MOP-1 and MOP-2 primers even when the 
exactly matching primer is not available. Moreover, rapid prod- 
uct cloning for sequence verification or characterization of 
novel RT-related sequences is possible. PCR conditions were 
optimized with respect to the amount of primers, annealing 
time, and annealing temperature (data not shown). 
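The degeneracy implied by back-translating the two conserved motifs can be estimated by multiplying, position by position, the number of codons that encode the allowed residues. This is a rough sketch using the standard genetic code. The actual MOP-1 and MOP-2 primer sequences are not given here, so the resulting numbers are upper bounds for a fully degenerate back-translation and not properties of the authors' primers; the function name and motif encoding are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(motif):
    """Count distinct coding sequences for a motif.

    motif is a list of positions; each position is a string of the
    residues allowed there, e.g. ["Y", "VM", "D"].
    """
    total = 1
    for allowed in motif:
        total *= sum(CODON_COUNT[aa] for aa in allowed)
    return total


VLPQG = ["V", "L", "P", "Q", "G"]
YXDD = ["Y", "VM", "D", "D", "IVL", "L"]  # YV/M DDI/V/LL from the text

print("VLPQG:", degeneracy(VLPQG))
print("YV/MDDI/V/LL:", degeneracy(YXDD))
```

Numbers of this size illustrate why relaxed annealing conditions and a stabilizing 5' extension are attractive when priming from such motifs.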



Table 1. Classification of Retrovirus-Specific Oligonucleotides and Dot Codes

Retrovirus family                     Member          Sequence                  Dot code

Type B retroviruses                   HERV-K(HML-1)   HML-1 (U35102)            1A
                                                      Seq29 (S77579)            1B
                                      HERV-K(HML-2)   HERV-K10 (M14123)         2A
                                                      HERV clone M3.5 (U87592)  2B
                                      HERV-K(HML-3)   HML-3 (U35236)            3A
                                                      HERV1 (S66676)            3B
                                                      RT244 (S77583)            3C
                                                      Seq26 a                   3D
                                                      Seq34 a                   4A
                                                      Seq42 a                   4B
                                                      Seq43 a                   4C
                                      HERV-K(HML-4)   HERV-K-T47D (AF020092)    5A
                                                      Seq05 a                   5B
                                                      Seq10 a                   5C
                                      HERV-K(HML-5)   HML-5 (U35161)            6A
                                      HERV-K(HML-6)   HML-6 (U60269)            7A
                                                      Seq38 a                   7B
                                                      Seq56 a                   7C
                                      HERV-K(C4)      HERV-K-C4 (U07856)        8A
                                                      Seq31 a                   8B
                                      Unassigned      SeqU39937 (U39937)        5F
Type C retroviruses                   HERV-H          SeqG46.2 (AF026252)       2J, 2K b
                                                      Seq61 a                   2K, 5J b
                                                      Seq66 a                   2L, 5K b
                                      ERV9/HERV-W     ERV9 (X57147)             4E
                                                      Seq49 a                   4F
                                                      Seq59 a                   4G
                                                      Seq60 a                   4H
                                                      Seq63 a                   4I
                                                      Seq64 a                   4J
                                                      HERV-W (AF009668)         4L
                                      ERV-FRD         ERV-FRD (U27240)          3E
                                                      Seq46 a                   3F
                                      HERV-ERI        HERV-E(4-1) (M10976)      2H
                                                      Seq32 a                   2I
                                      HERV-IP         HERV-I (M92067)           3H
                                                      HERV-IP-T47D (U27241)     3I
                                                      Seq65 a                   3J
                                      HERV-T          S71 pCRTK1 (U12970)       2E
                                                      S71 pCRTK6 (U12969)       2F
Type D retroviruses                   MPMV            Seq36 a                   5H
Foamy virus related                                   HERV-L (G895836)          1E
                                                      Seq39 a                   1F
                                                      Seq40 a                   1G
                                                      Seq45 a                   1H
                                                      Seq48 a                   1I
                                                      Seq51 a                   1J
                                                      Seq58 a                   1K
Unassigned human retroviral elements                  Seq35 c                   5G
                                                      Seq41 a                   5I
                                                      Seq77 a                   5J
Human nonviral retroposon                             LINE-1 (M80343)           3L
Human exogenous retroviruses                          HRV5 (U46939)             6E
                                                      Foamy virus (Y07725)      6F
                                                      HTLV-I c                  6G
                                                      HTLV-II (M10060)          6H
                                                      HIV-1 c                   6I
                                                      HIV-2 (J04542)            6J
Mammalian endogenous                                  MMTV (M15122)             7E
                                                      PERV (AF038600)           7F
                                                      BaEV (D10032)             7G
                                                      GaLV (M26927)             7H
                                                      Mo-MuLV (J02255)          7I
                                                      MPMV (M12349)             7J

a From Ref. 21.
b Filter code corresponding to Fig. 4 only.
c From Ref. 22.

Reverse dot-blot hybridization

In the second step, identification of the amplified products 
was performed by reverse dot-blot hybridization (RDBH) 
analysis (Fig. 2). This method is employed to sort unerringly 
all products of the PCR amplification. In contrast to the relaxed 
primer-template binding allowed during PCR amplification, 
RDBH enables the strict discrimination of PCR products and 
makes preceding false amplification of non-RT-related se- 
quences irrelevant. The high stringency of RDBH was achieved 
by the use of synthetic HERV-specific oligonucleotides spot- 
ted onto the filter membranes (Table 1). These oligonucleotides 
correspond to the pol sequences amplified with MOP-1 and 
MOP-2 primer sets, except that they lack the primer sequences 
themselves. Therefore, specificity of hybridization is due solely 
to the amplified sequence found between the described RT core
homology motifs. Thus, under high-stringency conditions the 
exact identification of even closely related retroviral sequences 
is possible. 

In this study we selected 61 retrovirus-specific oligonucleotides
for RDBH analysis corresponding to pol genes of representative
members of all known human exogenous and endogenous retrovirus
families. In addition, six mammalian retroviruses were included.
Origin and taxonomic classification of all sequences used are
summarized in Table 1.
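Because the results are reported as dot codes and dot-code ranges (for example "dots 3A-3D"), a small lookup helper makes it easier to translate a signal pattern back into the spotted oligonucleotides. The sketch below uses only a short excerpt of Table 1; the function names are hypothetical and the dictionary is deliberately incomplete.

```python
# Excerpt of Table 1 (dot code -> spotted oligonucleotide); not the full array.
TABLE1_EXCERPT = {
    "3A": "HML-3 (U35236)",
    "3B": "HERV1 (S66676)",
    "3C": "RT244 (S77583)",
    "3D": "Seq26",
    "7F": "PERV (AF038600)",
}


def expand_dot_range(spec):
    """Expand '3A-3D' or '5A-C' into single dot codes; '7F' stays as is."""
    if "-" not in spec:
        return [spec]
    start, end = spec.split("-")
    number, first, last = start[:-1], start[-1], end[-1]
    return [f"{number}{chr(c)}" for c in range(ord(first), ord(last) + 1)]


def identify(specs):
    """Map dot codes or ranges to table entries; unknown codes map to None."""
    return {
        code: TABLE1_EXCERPT.get(code)
        for spec in specs
        for code in expand_dot_range(spec)
    }


for code, name in identify(["3A-3D", "7F"]).items():
    print(code, "->", name)
```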

HERV transcription pattern in human peripheral 
blood mononuclear cells 

To assess the feasibility and specificity of the PCR/RDBH
assay system, HERV transcription was analyzed in human
PBMCs (Fig. 3). When the MOP-1 primer set alone was used
for amplification, almost exclusively type B-related HERVs
were detected, particularly members of the HERV-K subgroups
HML-2, -3, -4, -6, and -C4 (Fig. 3A, dots 2A and 2B, dots
3A-3D and 4A-4C, dots 5A-5C, dots 7A and 7B, and dots 8A
and 8B) and a not yet classified HERV-K-related sequence (dot
5F). This expression pattern concurs with previously published
studies demonstrating a differential expression of HERV-K
elements in human tissues. 18,19 No cross-hybridization was
observed with type C-related HERVs. Low amounts of products
were obtained for one of the human foamy virus-related
HERV-L elements (Fig. 3A, dot 1E).

With MOP-2 primers HERV-E-related elements (Fig. 3B;
dots 2H and 2I), sequences of the HERV-L family (dots
1E-1K), and ERV9-related HERVs (dots 4E-4G, and 4I) were
preferentially amplified. A certain amount of HERV-K(HML- 
4)- and HERV-K(HML-6)-related sequences was also present 
in the hybridization probe. Although the same amount of ra- 
dioactively labeled probes has been used in all hybridization re- 
actions, genomic control DNA (dots 8E-8H) gives much 
stronger signals with the MOP-2-amplified probe than with the 
MOP-1 probe, indicating that the human genome contains sig- 
nificantly more copies of type C-related than type B-related 
HERV elements. 

For detection of all retroviral sequences in a single experiment,
MOP-1 and MOP-2 primers were first added in an equimolar ratio
to the PCR. However, this experiment resulted in a predominant
amplification of type C-related sequences, the ABD-type
sequences being underrepresented (data not shown). Therefore,
we performed separate PCR with either MOP-1 or MOP-2 primer
sets, and mixed the purified amplification products of both
reactions in equimolar amounts. This procedure resulted in a
signal pattern that would have been expected when combining
both primer sets (Fig. 3C) and corresponds roughly to the amount
of type B- and type C-related HERV transcripts determined
previously in PBMCs by Northern analysis. 20 It is, however,
important to note that the PCR/RDBH assay is a qualitative test,
and cannot be quantified because of the use of highly degenerate
primers and cross-hybridization of closely related sequences.

FIG. 4. HERV expression pattern in T47D cells after steroid
treatment (A) and HERV transcripts packaged in T47D particles
(B). DNA probes for reverse dot-blot hybridization were
generated by reverse transcription of mRNA isolated from
steroid-treated T47D cells (A) and from T47D particles (B),
respectively. 3 For fast assignment of retrovirus-specific
oligonucleotides compare with (C); for origin and exact
identification see Table 1.

Detection and identification of pol sequences in
retroviral particles 

To evaluate the PCR/RDBH assay with respect to further
practical applications, e.g., detection of small amounts of
retroviral particles in preparations of therapeutic proteins, or
co-packaging of endogenous retroviral sequences in vector isolates
for human gene therapy, the PCR/RDBH assay was used for
analysis of retroviral particles produced by the human mammary
carcinoma cell line T47D. We have shown previously that
particles released from this cell line after induction with steroids
are pseudotypes and may package retroviral RNAs of different
origins. 3,4 T47D mRNA from steroid-treated cells (Fig. 4A) as
well as particle preparations from corresponding cell culture
supernatants (Fig. 4B) were therefore subjected to the PCR/RDBH
analysis. Comparison of HERV transcripts expressed in T47D
cells with HERV RNAs contained in T47D particles revealed
that, despite a high number of transcriptionally active HERV
elements in steroid-treated T47D cells, only three retroviral
RNAs are packaged in the pseudotype particles. Transcripts of
the HERV-K subgroups HML-4 (Fig. 4B, dots 5A-C) and
HML-6 (Fig. 4B, dot 7A) as well as type C-related ERV-FRD
elements (Fig. 4B, dots 3E and 3F) were found to have
accumulated preferentially in T47D particle preparations, whereas
the retroviral mRNA of T47D cells contains in addition
HERV-K(HML-1) (Fig. 4A, dot 1B), HERV-K(HML-3) (Fig. 4A, dots
3A-3D and 4A-4C), HERV-I (Fig. 4A, dot 3H), and HERV-H
transcripts (Fig. 4A, dots 2K, 5I, and 5K) as main components.
These results demonstrate the high sensitivity of the
PCR/RDBH assay, since T47D cells produce only low amounts
of particles not detectable by conventional methods, 3 and
suggest a high practical value for monitoring vector preparations
to be used in human gene therapy.



Sensitivity and species specificity 

To test further the sensitivity of the PCR/RDBH assay with
respect to prospective practical use, e.g., monitoring potential
transmission of porcine endogenous retroviruses (PERVs) via
xenotransplantation, we performed spiking experiments with
human PBMC-derived cDNA that contained serial dilutions of
a cloned DNA fragment from the PERV-A pol region. 10 Under
standardized test conditions as few as 10 copies of PERV DNA
were detectable in cDNA derived from 25 ng of human PBMC
mRNA (Fig. 5A, dot 7F). No cross-hybridization of
human-specific amplification products with the PERV-specific
oligonucleotides on the filter was observed (Fig. 3C, dot 7F).
Vice versa, when the PCR/RDBH test was performed with
MOP-amplified pig genomic DNA as hybridization probe, no
cross-hybridization of pig PCR products with human endogenous
or exogenous retroviral sequences was detected (Fig. 5B).
These results demonstrate the high species specificity of the
PCR/RDBH assay, ruling out the possibility of pig-human
retroviral interspecies cross-hybridization.

Interestingly, porcine genomic DNA seems to contain as yet 
unidentified PERVs that show a higher homology to murine 
leukemia viruses than PERVs A, B, and C, since a distinct signal
was found with Moloney MuLV (Mo-MuLV) (Fig. 5B, dot 7I). A weak
cross-hybridization with human genomic DNA
(Fig. 5B, dots 8E and 8F) indicates that the human genome may 
also contain some PERV-related sequences that have not yet 
been characterized and therefore are not represented on the 
HERV dot blot. 



DISCUSSION 

We have established a universal PCR/RDBH detection assay
that allows detection as well as identification of all known
retroviral RT-related sequences in a sample in a single
experiment. The assay combines a PCR using two sets of highly
degenerate primers and hybridization employing an array of
immobilized synthetic retrovirus-specific oligonucleotides. For
primer design two of the most highly conserved motifs within
the reverse transcriptase-encoding region of the pol gene were
exploited. The usefulness of these conserved motifs for
detection of novel retroviruses has been demonstrated in pioneering
work by several authors. 15,16,21,22 In the first step of our
experimental approach, which can be considered as "multiplex"
PCR under moderate primer-template annealing conditions, all
retroviral templates of the sample investigated are amplified. In
the second step, exact sorting of the amplified products is
performed by RDBH under high-stringency conditions. With this
highly sensitive and species-specific diagnostic tool we could
detect as few as 10 PERV DNA copies contained in cDNA
derived from 25 ng of human PBMC mRNA.

FIG. 5. HERV expression in human PBMCs after mixing human cDNA with a cloned DNA fragment containing PERV RT.
Fewer than 10 copies of pig endogenous retrovirus type A DNA 5,10 can be detected and identified under the standardized test
conditions (A, dot 7F). No cross-hybridization of porcine amplification products generated from a pig DNA template can be
observed with HERVs under the applied stringency conditions (B). For fast assignment of retrovirus-specific oligonucleotides
compare with (D) of Fig. 3; for origin and exact identification see Table 1.

However, it is important to emphasize that the PCR/RDBH 
assay is primarily a qualitative detection technique. Although 
the distinct intensity of the autoradiograph signals may give a 
strong impression of a quantitative monitoring of retrovirus ex- 
pression, it is worth noting that several parameters of uncer- 
tainty may lead to a signal pattern that differs from the true ex- 
pression rates in the sample. The use of highly degenerate 
primer sets combined with relaxed primer-template binding 
conditions may lead to preferential amplification of certain 
"high copy" or a few "best fit" templates, whereas others stay 
underrepresented. This effect increases with the number of PCR 
cycles performed and becomes critical above 35 cycles. Thus, 
no more than 30 rounds of PCR should be performed. The mul- 
tiplex PCR does not allow an internal standardization except 
for overall hybridization efficacy and autoradiograph exposure 
time. RT-related transcripts identified with PCR/RDBH must
be quantified in further experiments by conventional methods
such as Northern blotting or by a specific competitive PCR
established for the retroviral sequence of interest.
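The cycle-number effect described above can be illustrated with a toy model in which each template is amplified with its own constant per-cycle efficiency. This is only a sketch under stated assumptions: the efficiencies and starting copy numbers are hypothetical, and plateau effects, primer depletion, and cross-hybridization are ignored, so it shows the direction of the bias and not its real magnitude.

```python
def product_fractions(start_copies, efficiencies, cycles):
    """Fraction of total product per template after ideal exponential PCR.

    Template i grows as start_i * (1 + efficiency_i) ** cycles.
    """
    amounts = [
        n * (1.0 + e) ** cycles for n, e in zip(start_copies, efficiencies)
    ]
    total = sum(amounts)
    return [a / total for a in amounts]


# Hypothetical example: equal input, "best fit" vs. poorly matched template.
START = [1000, 1000]
EFFICIENCY = [0.90, 0.80]

for cycles in (10, 20, 30, 35):
    best, poor = product_fractions(START, EFFICIENCY, cycles)
    print(f"{cycles} cycles: best fit {best:.2f}, poor fit {poor:.2f}")
```

Even a modest difference in efficiency shifts the product mixture further toward the better-matched template with every additional cycle, which is consistent with treating the signal pattern as qualitative.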

On the other hand, it is an advantageous feature of the
PCR/RDBH assay that the highly degenerate primers, together with
cross-hybridization obtained by lowering the stringency of
hybridization conditions, allow isolation and characterization
of as yet unknown retroviral sequences. DNA hybridizing to
the covalently bound oligonucleotides can be eluted from the 
filter membrane by alkaline denaturation and reamplified to pro- 
vide sufficient double-stranded DNA for cloning and subse- 
quent sequence analysis. 

With the employment of nonradioactive labeling techniques 
the PCR/RDBH assay offers the possibility of an automatable 
procedure for rapid analysis of retroviral expression. DNA chip 
technology may be applied, facilitating handling and increasing
efficacy of reverse dot-blot filter membranes. Computer-assisted
evaluation of RDBH results by phospho/fluorescence-imaging
systems may further improve visualization. It is one
of the advantageous features of the PCR/RDBH assay system 
that the test is unlimited with respect to number and origin of 
retroviral RT sequences to be tested. Novel RT-encoding se- 
quences can be easily added to the filter arrays. Modifications 
in the experimental design are not necessary. Moreover, by 
modifying the hybridization conditions PCR/RDBH can be used 
to search for new exo- or endogenous retroviruses with only 
weak homologies to already known families. 

In summary, the PCR/RDBH assay is a powerful technique for
precise qualitative analysis of retrovirus activity in biological
samples. Number and types of retroviral sequences to be identified
are determined solely by number and types of synthetic
oligonucleotides spotted onto reverse dot-blot membranes. PCR/RDBH
could be useful in guarding patients against undesired
transmission of genetic material by retroviruses from therapeutic
protein preparations, in gene therapy and xenotransplantation.



ACKNOWLEDGMENTS

We are grateful to Martin Herrmann (Institut für Klinische
Immunologie und Rheumatologie, Friedrich-Alexander-Universität
Erlangen-Nürnberg, Germany) for providing unpublished
HERV sequence data. We further thank Dr. A. Arthur-Goettig
for critically reading the manuscript. This work was
supported in part by the German BMBF, Grant 01 GB9403, and 
by the Forschungsfonds of the Faculty of Clinical Medicine 
Mannheim, University of Heidelberg. 



REFERENCES 

1. Wilkinson DA, Mager DL, and Leong JC: Endogenous human 
retroviruses. In: The Retroviridae (Levy JA, ed.), Vol. 3. Plenum 
Press, New York, 1994, pp. 465-535. 

2. Leib-Mösch C and Seifarth W: Evolution and biological
significance of human retroelements. Virus Genes 1996;11:133-145.

3. Seifarth W, Skladny H, Krieg-Schneider F, Reichert A, Hehlmann
R, and Leib-Mösch C: Retrovirus-like particles released from the
human breast cancer cell line T47-D display B- and C-type related
endogenous retroviral sequences. J Virol 1995;69:6408-6416.

4. Seifarth W, Baust C, Murr A, Skladny H, Krieg-Schneider F,
Blusch J, Werner T, Hehlmann R, and Leib-Mösch C: Proviral
structure, chromosomal location and expression of
HERV-K-T47D: A novel human endogenous retrovirus derived from T47D
particles. J Virol 1998;72:8384-8391.

5. Patience C, Takeuchi Y, and Weiss RA: Infection of human cells 
by an endogenous retrovirus of pigs. Nature Med 1997;3:282-286. 

6. Patience C, Takeuchi Y, Cosset FL, and Weiss RA: Packaging of
endogenous retroviral sequences in retroviral vectors produced by
murine and human packaging cells. J Virol 1998;72:2671-2676.

7. Günzburg WH and Salmons B: Development of retroviral vectors
as safe, targeted gene delivery systems. J Mol Med 1996;74:171-182.

8. Martin U, Kiessig V, Blusch JH, Haverich A, van der Helm K, 
Herden T, and Steinhoff G: Expression of pig endogenous retro- 
virus by primary porcine endothelial cells and infection of human 
cells. Lancet 1998;352:692-694. 

9. Wilson CA, Wong S, Muller J, Davidson CE, Rose TM, and Burd
P: Type C retrovirus released from porcine primary peripheral
blood mononuclear cells infects human cells. J Virol
1998;72:3082-3087.

10. Takeuchi Y, Patience C, Magre S, Weiss RA, Banerjee PT, Le
Tissier P, and Stoye JP: Host range and interference studies of three 
classes of pig endogenous retrovirus. J Virol 1998;72:9986-9991. 

11. Bach FH and Fineberg HV: Call for a moratorium on xenotrans- 
plants. Nature (London) 1998;391:326. 

12. Stoye JP and Coffin JM: The dangers of xenotransplantation. Na- 
ture Med 1995;1:1100. 

13. Sambrook J, Fritsch EF, and Maniatis T: Molecular Cloning: A 
Laboratory Manual, 2nd Ed. Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, New York, 1989. 

14. Xiong Y and Eickbush TH: Origin and evolution of retroelements 
based upon their reverse transcriptase sequences. EMBO J 
1990;9:3353-3362. 




15. Shih A, Misra R, and Rush MG: Detection of multiple, novel
reverse transcriptase coding sequences in human nucleic acids:
Relation to primate retroviruses. J Virol 1989;62:64-75.

16. Donehower LA, Bohannon RC, Ford RJ, and Gibbs RA: The use
of primers from highly conserved pol regions to identify
uncharacterized retroviruses by the polymerase chain reaction. J Virol
Methods 1990;28:33-46.

17. McClure MA, In Skalka AM, and Goff SP (eds.): Evolutionary 
History of Reverse Transcriptase. Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, New York, 1993: pp. 425-444. 

18. Medstrand P, Lindeskog M, and Blomberg J: Expression of human
endogenous retroviral sequences in peripheral blood mononuclear 
cells of healthy individuals. J Gen Virol 1992;73:2463-2466. 

19. Andersson ML, Medstrand P, Yin H, and Blomberg J: Differential 
expression of human endogenous retroviral sequences similar to 
mouse mammary tumor virus in normal peripheral blood mononu- 
clear cells. AIDS Res Hum Retroviruses 1996;12:833-840. 

20. Krieg AM, Gourley MF, Klinman DM, Perl A, and Steinberg AD:
Heterogenous expression and coordinate regulation of endogenous
retroviral sequences in human peripheral blood mononuclear cells.
AIDS Res Hum Retroviruses 1992;8:1991-1998.

21. Herrmann M, and Kalden JR: PCR and reverse dot blot
hybridization for detection of endogenous retroviral transcripts.
J Virol Methods 1994;46:333-348.

22. Tuke PW, Perron H, Bedin F, Beseme F, and Garson JA:
Development of a pan-retrovirus detection system for multiple sclerosis
studies. Acta Neurol Scand 1997;169(Suppl.):16-21.

Address reprint requests to: 
Wolfgang Seifarth 
III. Medizinische Klinik
Klinikum Mannheim der Universität Heidelberg
Wiesbadener Str. 7-11
D-68305 Mannheim, Germany

E-mail: seifarth@rumms.uni-mannheim.de



